Analiza danych
Analiza danych jest ekscytującą dyscypliną, która umożliwia zrozumienie pewnych zjawisk, uzyskanie wglądu i wiedzy na podstawie surowych danych. Pojęcie to oznacza dokładnie przetwarzanie danych za pomocą technik matematycznych i statystycznych w celu uzyskania cennych wniosków, podjęcia ważnych decyzji i opracowania przydatnych produktów. Termin ten wywodzi się od angielskiego data science, często traktowanego jako synonim takich terminów, jak analityka biznesowa, badania operacyjne, business intelligence, wywiad konkurencyjny, analiza i modelowanie danych, a także pozyskiwanie wiedzy. Dzięki takim technologiom, jak języki Python czy R, platformy Hadoop i Spark masz szansę wyciągnąć maksimum wniosków, dostrzec szanse na rozwój swojej organizacji albo przewidzieć i zapobiec zagrożeniom.
De Mauro
Data Analytics Made Easy is an accessible beginner’s guide for anyone working with data. The book interweaves four key elements:Data visualizations and storytelling – Tired of people not listening to you and ignoring your results? Don’t worry; chapters 7 and 8 show you how to enhance your presentations and engage with your managers and co-workers. Learn to create focused content with a well-structured story behind it to captivate your audience.Automating your data workflows – Improve your productivity by automating your data analysis. This book introduces you to the open-source platform, KNIME Analytics Platform. You’ll see how to use this no-code and free-to-use software to create a KNIME workflow of your data processes just by clicking and dragging components.Machine learning – Data Analytics Made Easy describes popular machine learning approaches in a simplified and visual way before implementing these machine learning models using KNIME. You’ll not only be able to understand data scientists’ machine learning models; you’ll be able to challenge them and build your own.Creating interactive dashboards – Follow the book’s simple methodology to create professional-looking dashboards using Microsoft Power BI, giving users the capability to slice and dice data and drill down into the results.
Data Analytics. Master the Art of Data Analytics with Essential Tools and Techniques
Mercury Learning and Information, Christopher Greco
Data analytics is becoming increasingly important in our daily lives. This book offers a comprehensive view of data analytics skills, starting with a primer on statistics and progressing to the application of these methods. The text includes various formulas and algorithms used in data analytics, which can be applied in any software to achieve desired results. Through numerous demonstrations, it provides clear instruction on how to incorporate data analytics into critical thinking.The book covers a range of methods and techniques, supplemented with case studies specific to project managers, systems engineers, and cybersecurity professionals. Each profession can practice data analytics relevant to their fields. The main objective is to refresh statistical knowledge necessary for building data analytics models and to foster analytical thinking essential across these professions.From introducing statistics and data to reviewing central tendency measures and probability, the book moves to more complex topics like effect size, analysis methods, and data presentation. By the end of the course, readers will be well-versed in data analytics, ready to apply these skills effectively in their respective fields, enhancing decision-making and analytical thinking.
Dr. Nadine Shillingford
Splunk 9 improves on the existing Splunk tool to include important features such as federated search, observability, performance improvements, and dashboarding. This book helps you to make the best use of the impressive and new features to prepare a Splunk installation that can be employed in the data analysis process.Starting with an introduction to the different Splunk components, such as indexers, search heads, and forwarders, this Splunk book takes you through the step-by-step installation and configuration instructions for basic Splunk components using Amazon Web Services (AWS) instances. You’ll import the BOTS v1 dataset into a search head and begin exploring data using the Splunk Search Processing Language (SPL), covering various types of Splunk commands, lookups, and macros. After that, you’ll create tables, charts, and dashboards using Splunk’s new Dashboard Studio, and then advance to work with clustering, container management, data models, federated search, bucket merging, and more.By the end of the book, you’ll not only have learned everything about the latest features of Splunk 9 but also have a solid understanding of the performance tuning techniques in the latest version.
Michael Walker
Many individuals who know how to run machine learning algorithms do not have a good sense of the statistical assumptions they make and how to match the properties of the data to the algorithm for the best results.As you start with this book, models are carefully chosen to help you grasp the underlying data, including in-feature importance and correlation, and the distribution of features and targets. The first two parts of the book introduce you to techniques for preparing data for ML algorithms, without being bashful about using some ML techniques for data cleaning, including anomaly detection and feature selection. The book then helps you apply that knowledge to a wide variety of ML tasks. You’ll gain an understanding of popular supervised and unsupervised algorithms, how to prepare data for them, and how to evaluate them. Next, you’ll build models and understand the relationships in your data, as well as perform cleaning and exploration tasks with that data. You’ll make quick progress in studying the distribution of variables, identifying anomalies, and examining bivariate relationships, as you focus more on the accuracy of predictions in this book.By the end of this book, you’ll be able to deal with complex data problems using unsupervised ML algorithms like principal component analysis and k-means clustering.
Gus Frazer
Microsoft Power BI offers a range of powerful data cleaning and preparation options through tools such as DAX, Power Query, and the M language. However, despite its user-friendly interface, mastering it can be challenging. Whether you're a seasoned analyst or a novice exploring the potential of Power BI, this comprehensive guide equips you with techniques to transform raw data into a reliable foundation for insightful analysis and visualization.This book serves as a comprehensive guide to data cleaning, starting with data quality, common data challenges, and best practices for handling data. You’ll learn how to import and clean data with Query Editor and transform data using the M query language. As you advance, you’ll explore Power BI’s data modeling capabilities for efficient cleaning and establishing relationships. Later chapters cover best practices for using Power Automate for data cleaning and task automation. Finally, you’ll discover how OpenAI and ChatGPT can make data cleaning in Power BI easier.By the end of the book, you will have a comprehensive understanding of data cleaning concepts, techniques, and how to use Power BI and its tools for effective data preparation.
Jeff Burtenshaw
Domo is a power-packed business intelligence (BI) platform that empowers organizations to track, analyze, and activate data in record time at cloud scale and performance.Data Democratization with Domo begins with an overview of the Domo ecosystem. You’ll learn how to get data into the cloud with Domo data connectors and Workbench; profile datasets; use Magic ETL to transform data; work with in-memory data sculpting tools (Data Views and Beast Modes); create, edit, and link card visualizations; and create card drill paths using Domo Analyzer. Next, you’ll discover options to distribute content with real-time updates using Domo Embed and digital wallboards. As you advance, you’ll understand how to use alerts and webhooks to drive automated actions. You’ll also build and deploy a custom app to the Domo Appstore and find out how to code Python apps, use Jupyter Notebooks, and insert R custom models. Furthermore, you’ll learn how to use Auto ML to automatically evaluate dozens of models for the best fit using SageMaker and produce a predictive model as well as use Python and the Domo Command Line Interface tool to extend Domo. Finally, you’ll learn how to govern and secure the entire Domo platform.By the end of this book, you’ll have gained the skills you need to become a successful Domo master.
Data Engineering Best Practices. Architect robust and cost-effective data solutions in the cloud era
Richard J. Schiller, David Larochelle
Revolutionize your approach to data processing in the fast-paced business landscape with this essential guide to data engineering. Discover the power of scalable, efficient, and secure data solutions through expert guidance on data engineering principles and techniques. Written by two industry experts with over 60 years of combined experience, it offers deep insights into best practices, architecture, agile processes, and cloud-based pipelines. You’ll start by defining the challenges data engineers face and understand how this agile and future-proof comprehensive data solution architecture addresses them. As you explore the extensive toolkit, mastering the capabilities of various instruments, you’ll gain the knowledge needed for independent research. Covering everything you need, right from data engineering fundamentals, the guide uses real-world examples to illustrate potential solutions. It elevates your skills to architect scalable data systems, implement agile development processes, and design cloud-based data pipelines. The book further equips you with the knowledge to harness serverless computing and microservices to build resilient data applications.By the end, you'll be armed with the expertise to design and deliver high-performance data engineering solutions that are not only robust, efficient, and secure but also future-ready.
Manoj Kukreja
In the world of ever-changing data and schemas, it is important to build data pipelines that can auto-adjust to changes. This book will help you build scalable data platforms that managers, data scientists, and data analysts can rely on.Starting with an introduction to data engineering, along with its key concepts and architectures, this book will show you how to use Microsoft Azure Cloud services effectively for data engineering. You'll cover data lake design patterns and the different stages through which the data needs to flow in a typical data lake. Once you've explored the main features of Delta Lake to build data lakes with fast performance and governance in mind, you'll advance to implementing the lambda architecture using Delta Lake. Packed with practical examples and code snippets, this book takes you through real-world examples based on production scenarios faced by the author in his 10 years of experience working with big data. Finally, you'll cover data lake deployment strategies that play an important role in provisioning the cloud resources and deploying the data pipelines in a repeatable and continuous way.By the end of this data engineering book, you'll know how to effectively deal with ever-changing data and create scalable data pipelines to streamline data science, ML, and artificial intelligence (AI) tasks.