Big data

521
EBOOK

Learning Pandas. Get to grips with pandas - a versatile and high-performance Python library for data manipulation, analysis, and discovery

Michael Heydt

If you are a Python programmer who wants to get started with performing data analysis using pandas and Python, this is the book for you. Some experience with statistical analysis would be helpful but is not mandatory.

522
EBOOK

Learning pandas. High performance data manipulation and analysis using Python - Second Edition

Nicola Rainiero, Sonali Dayal, Michael Heydt

You will learn how to use pandas to perform data analysis in Python. You will start with an overview of data analysis and iteratively progress from modeling data, to accessing data from remote sources, performing numeric and statistical analysis, through indexing and performing aggregate analysis, and finally to visualizing statistical data and applying pandas to finance. With the knowledge you gain from this book, you will quickly learn pandas and how it can empower you in the exciting world of data manipulation, analysis and science.

523
EBOOK

Learning Pentaho Data Integration 8 CE. An end-to-end guide to exploring, transforming, and integrating your data across multiple sources - Third Edition

Dan Keeley, Diethard Steiner, María Carina Roldán,...

Pentaho Data Integration (PDI) is an intuitive, graphical environment packed with drag-and-drop design and powerful Extract-Transform-Load (ETL) capabilities. This book shows and explains the new interactive features of Spoon, the revamped look and feel, and the newest features of the tool, including transformation and job executors and the invaluable Metadata Injection capability. We begin with the installation of the PDI software and then move on to cover all the key PDI concepts. Each chapter introduces new features, enabling you to gradually get practice with the tool. First, you will learn to do all kinds of data manipulation and work with simple plain files. Then, the book teaches you how to work with relational databases inside PDI. You will also be given a primer on data warehouse concepts and learn how to load data into a data warehouse. During the course of this book, you will become familiar with PDI's intuitive, graphical, drag-and-drop design environment. By the end of this book, you will have learned everything you need to know to meet your data manipulation requirements. You will also be given best practices and advice for designing and deploying your projects.

524
EBOOK

Learning Predictive Analytics with Python. Gain practical insights into predictive modelling by implementing Predictive Analytics algorithms on public datasets with Python

Ashish Kumar

Social media and the Internet of Things have resulted in an avalanche of data. Data is powerful but not in its raw form - it needs to be processed and modeled, and Python is one of the most robust tools out there to do so. It has an array of packages for predictive modeling and a suite of IDEs to choose from. Learning to predict who would win, lose, buy, lie, or die with Python is an indispensable skill set to have in this data age. This book is your guide to getting started with Predictive Analytics using Python. You will see how to process data and make predictive models from it. We balance both statistical and mathematical concepts, and implement them in Python using libraries such as pandas, scikit-learn, and numpy. You’ll start by getting an understanding of the basics of predictive modeling, then you will see how to cleanse your data of impurities and get it ready for predictive modeling. You will also learn more about the best predictive modeling algorithms such as Linear Regression, Decision Trees, and Logistic Regression. Finally, you will see the best practices in predictive modeling, as well as the different applications of predictive modeling in the modern world.

525
EBOOK

Learning Probabilistic Graphical Models in R. Familiarize yourself with probabilistic graphical models through real-world problems and illustrative code examples in R

David Bellot

Probabilistic graphical models (PGMs, also known as graphical models) are a marriage between probability theory and graph theory. Generally, PGMs use a graph-based representation. Two branches of graphical representations of distributions are commonly used, namely Bayesian networks and Markov networks. R has many packages to implement graphical models. We’ll start by showing you how to transform a classical statistical model into a modern PGM and then look at how to do exact inference in graphical models. From there, we’ll introduce you to many modern R packages that will help you to perform inference on the models. We will then run a Bayesian linear regression and you’ll see the advantage of going probabilistic when you want to do prediction. Next, you’ll master using R packages and implementing their techniques. Finally, you’ll be presented with machine learning applications that have a direct impact in many fields. Here, we’ll cover clustering and the discovery of hidden information in big data, as well as two important methods, PCA and ICA, to reduce the size of big problems.

526
EBOOK

Learning PySpark. Build data-intensive applications locally and deploy at scale using the combined powers of Python and Spark 2.0

Tomasz Drabas, Denny Lee

Apache Spark is an open source framework for efficient cluster computing with a strong interface for data parallelism and fault tolerance. This book will show you how to leverage the power of Python and put it to use in the Spark ecosystem. You will start by getting a firm understanding of the Spark 2.0 architecture and how to set up a Python environment for Spark. You will get familiar with the modules available in PySpark. You will learn how to abstract data with RDDs and DataFrames and understand the streaming capabilities of PySpark. Also, you will get a thorough overview of machine learning capabilities of PySpark using ML and MLlib, graph processing using GraphFrames, and polyglot persistence using Blaze. Finally, you will learn how to deploy your applications to the cloud using the spark-submit command. By the end of this book, you will have established a firm understanding of the Spark Python API and how it can be used to build data-intensive applications.

527
EBOOK

Learning QGIS, Third Edition. Create great maps and perform geoprocessing tasks with ease - Third Edition

Anita Graser

QGIS is a user-friendly open source geographic information system (GIS) that runs on Linux, Unix, Mac OS X, and Windows. The popularity of open source geographic information systems and QGIS in particular has been growing rapidly over the last few years. Learning QGIS Third Edition is a practical, hands-on guide updated for QGIS 2.14 that provides you with clear, step-by-step exercises to help you apply your GIS knowledge to QGIS. Through clear, practical exercises, this book will introduce you to working with QGIS quickly and painlessly. This book takes you from installing and configuring QGIS to handling spatial data to creating great maps. You will learn how to load and visualize existing spatial data and create data from scratch. You will get to know important plugins, perform common geoprocessing and spatial analysis tasks and automate them with Processing. We will cover how to achieve great cartographic output and print maps. Finally, you will learn how to extend QGIS using Python and even create your own plugin.

528
EBOOK

Learning Quantitative Finance with R. Implement machine learning, time-series analysis, algorithmic trading and more

Prashant Vats, Dr. Param Jeet

The role of a quantitative analyst is very challenging, yet lucrative, so there is a lot of competition for the role in top-tier organizations and investment banks. This book is your go-to resource if you want to equip yourself with the skills required to tackle any real-world problem in quantitative finance using the popular R programming language. You'll start by getting an understanding of the basics of R and its relevance in the field of quantitative finance. Once you've built this foundation, we'll dive into the practicalities of building financial models in R. This will give you a fair understanding of the topics as well as their implementation, as the authors present use cases along with examples that are easy to understand and relate to. We'll also look at risk management and optimization techniques for algorithmic trading. Finally, the book will explain some advanced concepts, such as trading using machine learning, optimizations, exotic options, and hedging. By the end of this book, you will have a firm grasp of the techniques required to implement basic quantitative finance models in R.