Incluye índice. CONTENIDO: Gentle overview of big data and Spark. What is Apache Spark? -- A gentle introduction to Spark -- A tour of Spark's toolset -- Structured APIs : DataFrames, SQL, and datasets. Structured API overview -- Basic structured operations -- Working with different types of data -- Aggregations -- Joins -- Data sources -- Spark SQL -- Datasets -- Low-level APIs. Resilient distributed datasets (RDDs) -- Advanced RDDs -- Distributed shared variables -- Production applications. How Spark runs on a cluster -- Developint Spark applications -- Deploying Spark -- Monitoring and debugging -- Performance tuning -- Streaming. Stream processing fundamentals -- Structured streaming basics -- Event-time and stateful processing -- Structured streaming in production -- Advanced analytics and machine learning. Advanced analytics and machine learning overview -- Preprocessing and feature engineering -- Classification -- Regression -- Recommendation -- Unsupervised learning -- Graph analytics -- Deep learning -- Ecosystem. Language specifics : Python (PySpark) and R (SparkR and sparklyr) -- Ecosystem and community.