Analiza danych
Analiza danych jest ekscytującą dyscypliną, która umożliwia zrozumienie pewnych zjawisk, uzyskanie wglądu i wiedzy na podstawie surowych danych. Pojęcie to oznacza dokładnie przetwarzanie danych za pomocą technik matematycznych i statystycznych w celu uzyskania cennych wniosków, podjęcia ważnych decyzji i opracowania przydatnych produktów. Termin ten wywodzi się od angielskiego data science, często traktowanego jako synonim takich terminów, jak analityka biznesowa, badania operacyjne, business intelligence, wywiad konkurencyjny, analiza i modelowanie danych, a także pozyskiwanie wiedzy. Dzięki takim technologiom, jak języki Python czy R, platformy Hadoop i Spark masz szansę wyciągnąć maksimum wniosków, dostrzec szanse na rozwój swojej organizacji albo przewidzieć i zapobiec zagrożeniom.
Michael Walker
Jumping into data analysis without proper data cleaning will certainly lead to incorrect results. The Python Data Cleaning Cookbook - Second Edition will show you tools and techniques for cleaning and handling data with Python for better outcomes.Fully updated to the latest version of Python and all relevant tools, this book will teach you how to manipulate and clean data to get it into a useful form. he current edition focuses on advanced techniques like machine learning and AI-specific approaches and tools for data cleaning along with the conventional ones. The book also delves into tips and techniques to process and clean data for ML, AI, and NLP models. You will learn how to filter and summarize data to gain insights and better understand what makes sense and what does not, along with discovering how to operate on data to address the issues you've identified. Next, you’ll cover recipes for using supervised learning and Naive Bayes analysis to identify unexpected values and classification errors and generate visualizations for exploratory data analysis (EDA) to identify unexpected values. Finally, you’ll build functions and classes that you can reuse without modification when you have new data.By the end of this Data Cleaning book, you'll know how to clean data and diagnose problems within it.
Alberto Boschetti, Luca Massaron
Fully expanded and upgraded, the latest edition of Python Data Science Essentials will help you succeed in data science operations using the most common Python libraries. This book offers up-to-date insight into the core of Python, including the latest versions of the Jupyter Notebook, NumPy, pandas, and scikit-learn.The book covers detailed examples and large hybrid datasets to help you grasp essential statistical techniques for data collection, data munging and analysis, visualization, and reporting activities. You will also gain an understanding of advanced data science topics such as machine learning algorithms, distributed computing, tuning predictive models, and natural language processing. Furthermore, You’ll also be introduced to deep learning and gradient boosting solutions such as XGBoost, LightGBM, and CatBoost.By the end of the book, you will have gained a complete overview of the principal machine learning algorithms, graph analysis techniques, and all the visualization and deployment instruments that make it easier to present your results to an audience of both data science experts and business users
Python Data Science Essentials. Learn the fundamentals of Data Science with Python - Second Edition
Alberto Boschetti, Luca Massaron
Fully expanded and upgraded, the second edition of Python Data Science Essentials takes you through all you need to know to suceed in data science using Python. Get modern insight into the core of Python data, including the latest versions of Jupyter notebooks, NumPy, pandas and scikit-learn. Look beyond the fundamentals with beautiful data visualizations with Seaborn and ggplot, web development with Bottle, and even the new frontiers of deep learning with Theano and TensorFlow. Dive into building your essential Python 3.5 data science toolbox, using a single-source approach that will allow to to work with Python 2.7 as well. Get to grips fast with data munging and preprocessing, and all the techniques you need to load, analyse, and process your data. Finally, get a complete overview of principal machine learning algorithms, graph analysis techniques, and all the visualization and deployment instruments that make it easier to present your results to an audience of both data science experts and business users.
Python Data Visualization Cookbook. Visualize data using Python's most popular libraries
Dimitry Foures, Giuseppe Vettigli, Tarek Amr, Igor...
Python Data Visualization Cookbook will progress the reader from the point of installing and setting up a Python environment for data manipulation and visualization all the way to 3D animations using Python libraries. Readers will benefit from over 60 precise and reproducible recipes that will guide the reader towards a better understanding of data concepts and the building blocks for subsequent and sometimes more advanced concepts.Python Data Visualization Cookbook starts by showing how to set up matplotlib and the related libraries that are required for most parts of the book, before moving on to discuss some of the lesser-used diagrams and charts such as Gantt Charts or Sankey diagrams. Initially it uses simple plots and charts to more advanced ones, to make it easy to understand for readers. As the readers will go through the book, they will get to know about the 3D diagrams and animations. Maps are irreplaceable for displaying geo-spatial data, so this book will also show how to build them. In the last chapter, it includes explanation on how to incorporate matplotlib into different environments, such as a writing system, LaTeX, or how to create Gantt charts using Python.
Ivan Vasilev, Daniel Slater, Gianmario Spacagna, Peter...
With the surge in artificial intelligence in applications catering to both business and consumer needs, deep learning is more important than ever for meeting current and future market demands. With this book, you’ll explore deep learning, and learn how to put machine learning to use in your projects.This second edition of Python Deep Learning will get you up to speed with deep learning, deep neural networks, and how to train them with high-performance algorithms and popular Python frameworks. You’ll uncover different neural network architectures, such as convolutional networks, recurrent neural networks, long short-term memory (LSTM) networks, and capsule networks. You’ll also learn how to solve problems in the fields of computer vision, natural language processing (NLP), and speech recognition. You'll study generative model approaches such as variational autoencoders and Generative Adversarial Networks (GANs) to generate images. As you delve into newly evolved areas of reinforcement learning, you’ll gain an understanding of state-of-the-art algorithms that are the main components behind popular games Go, Atari, and Dota.By the end of the book, you will be well-versed with the theory of deep learning along with its real-world applications.
Ivan Idris, Luiz Felipe Martins, Martin Czygan,...
Data analysis is the process of applying logical and analytical reasoning to study each component of data present in the system. Python is a multi-domain, high-level, programming language that offers a range of tools and libraries suitable for all purposes, it has slowly evolved as one of the primary languages for data science. Have you ever imagined becoming an expert at effectively approaching data analysis problems, solving them, and extracting all of the available information from your data? If yes, look no further, this is the course you need!In this course, we will get you started with Python data analysis by introducing the basics of data analysis and supported Python libraries such as matplotlib, NumPy, and pandas. Create visualizations by choosing color maps, different shapes, sizes, and palettes then delve into statistical data analysis using distribution algorithms and correlations. You’ll then find your way around different data and numerical problems, get to grips with Spark and HDFS, and set up migration scripts for web mining. You’ll be able to quickly and accurately perform hands-on sorting, reduction, and subsequent analysis, and fully appreciate how data analysis methods can support business decision-making. Finally, you will delve into advanced techniques such as performing regression, quantifying cause and effect using Bayesian methods, and discovering how to use Python’s tools for supervised machine learning.The course provides you with highly practical content explaining data analysis with Python, from the following Packt books:1. Getting Started with Python Data Analysis.2. Python Data Analysis Cookbook.3. Mastering Python Data Analysis.By the end of this course, you will have all the knowledge you need to analyze your data with varying complexity levels, and turn it into actionable insights.
Soledad Galli
Feature engineering, the process of transforming variables and creating features, albeit time-consuming, ensures that your machine learning models perform seamlessly. This second edition of Python Feature Engineering Cookbook will take the struggle out of feature engineering by showing you how to use open source Python libraries to accelerate the process via a plethora of practical, hands-on recipes.This updated edition begins by addressing fundamental data challenges such as missing data and categorical values, before moving on to strategies for dealing with skewed distributions and outliers. The concluding chapters show you how to develop new features from various types of data, including text, time series, and relational databases. With the help of numerous open source Python libraries, you'll learn how to implement each feature engineering method in a performant, reproducible, and elegant manner.By the end of this Python book, you will have the tools and expertise needed to confidently build end-to-end and reproducible feature engineering pipelines that can be deployed into production.
Jason Strimpel
Discover how Python has made algorithmic trading accessible to non-professionals with unparalleled expertise and practical insights from Jason Strimpel, founder of PyQuant News and a seasoned professional with global experience in trading and risk management. This book guides you through from the basics of quantitative finance and data acquisition to advanced stages of backtesting and live trading.Detailed recipes will help you leverage the cutting-edge OpenBB SDK to gather freely available data for stocks, options, and futures, and build your own research environment using lightning-fast storage techniques like SQLite, HDF5, and ArcticDB. This book shows you how to use SciPy and statsmodels to identify alpha factors and hedge risk, and construct momentum and mean-reversion factors. You’ll optimize strategy parameters with walk-forward optimization using VectorBT and construct a production-ready backtest using Zipline Reloaded. Implementing all that you’ve learned, you’ll set up and deploy your algorithmic trading strategies in a live trading environment using the Interactive Brokers API, allowing you to stream tick-level data, submit orders, and retrieve portfolio details.By the end of this algorithmic trading book, you'll not only have grasped the essential concepts but also the practical skills needed to implement and execute sophisticated trading strategies using Python.