Analiza danych
Analiza danych jest ekscytującą dyscypliną, która umożliwia zrozumienie pewnych zjawisk, uzyskanie wglądu i wiedzy na podstawie surowych danych. Pojęcie to oznacza dokładnie przetwarzanie danych za pomocą technik matematycznych i statystycznych w celu uzyskania cennych wniosków, podjęcia ważnych decyzji i opracowania przydatnych produktów. Termin ten wywodzi się od angielskiego data science, często traktowanego jako synonim takich terminów, jak analityka biznesowa, badania operacyjne, business intelligence, wywiad konkurencyjny, analiza i modelowanie danych, a także pozyskiwanie wiedzy. Dzięki takim technologiom, jak języki Python czy R, platformy Hadoop i Spark masz szansę wyciągnąć maksimum wniosków, dostrzec szanse na rozwój swojej organizacji albo przewidzieć i zapobiec zagrożeniom.
Effective Business Intelligence with QuickSight. Boost your business IQ with Amazon QuickSight
Rajesh Nadipalli
Amazon QuickSight is the next-generation Business Intelligence (BI) cloud service that can help you build interactive visualizations on top of various data sources hosted on Amazon Cloud Infrastructure. QuickSight delivers responsive insights into big data and enables organizations to quickly democratize data visualizations and scale to hundreds of users at a fraction of the cost when compared to traditional BI tools.This book begins with an introduction to Amazon QuickSight, feature differentiators from traditional BI tools, and how it fits in the overall AWS big data ecosystem. With practical examples, you will find tips and techniques to load your data to AWS, prepare it, and finally visualize it using QuickSight. You will learn how to build interactive charts, reports, dashboards, and stories using QuickSight and share with others using just your browser and mobile app.The book also provides a blueprint to build a real-life big data project on top of AWS Data Lake Solution and demonstrates how to build a modern data lake on the cloud with governance, data catalog, and analysis. It reviews the current product shortcomings, features in the roadmap, and how to provide feedback to AWS.Grow your profits, improve your products, and beat your competitors.
Ankur Jain
Organizations are moving their applications, data, and processes to the cloud to reduce application costs, effort, and maintenance. However, adopting new technology poses challenges for developers, solutions architects, and designers due to a lack of knowledge and appropriate practical training resources. This book helps you get to grips with Oracle Visual Builder (VB) and enables you to quickly develop web and mobile applications and deploy them to production without hassle.This book will provide you with a solid understanding of VB so that you can adopt it at a faster pace and start building applications right away. After working with real-time examples to learn about VB, you'll discover how to design, develop, and deploy web and mobile applications quickly. You'll cover all the VB components in-depth, including web and mobile application development, business objects, and service connections. In order to use all these components, you'll also explore best practices, security, and recommendations, which are well explained within the chapters. Finally, this book will help you gain the knowledge you need to enhance the performance of an application before deploying it to production.By the end of this book, you will be able to work independently and deploy your VB applications efficiently and with confidence.
Ekstrakcja danych w Pythonie. Teoria i praktyka
Piotr Rybka
Dane: załaduj, przetwarzaj, analizuj Ekstrakcja danych jest procesem, w którym informacje pozyskuje się z różnych źródeł - zwykle po to, by następnie poddać je dalszej transformacji i analizie. Umiejętność pozyskiwania danych, scalania, filtrowania i obrabiania ich na rozmaite sposoby przydaje się nie tylko zawodowym analitykom. Zdolność poruszania się po świecie danych jest wysoce pożądana również u osób pracujących w działach IT i na stanowiskach menadżerskich. Kto ma dane, ten ma wiedzę i zyskuje przewagę nad konkurencją! Jeśli chcesz zgłębić teorię ekstrakcji danych i zdobyć praktyczne umiejętności pozwalające operować nimi w Pythonie, ten podręcznik powinien być dla Ciebie pozycją obowiązkową. Dzięki książce między innymi: Opanujesz podstawowe pojęcia, których znajomość jest niezbędna podczas działań na zbiorach danych Zrozumiesz specyfikę plików binarnych i tekstowych Dowiesz się, na czym polega kodowanie tekstu Poznasz zagadnienia wyrażeń regularnych Zorientujesz się, jakie formaty wymiany danych są dostępne w Pythonie Nauczysz się przeszukiwać dokumenty znacznikowe Zapoznasz się ze schematami formatów wymiany danych
Elasticsearch Indexing. How to Improve User's Search Experience
Huseyin Akdogan
Beginning with an overview of the way ElasticSearch stores data, you’ll begin to extend your knowledge to tackle indexing and mapping, and learn how to configure ElasticSearch to meet your users’ needs. You’ll then find out how to use analysis and analyzers for greater intelligence in how you organize and pull up search results – to guarantee that every search query is met with the relevant results! You’ll explore the anatomy of an ElasticSearch cluster, and learn how to set up configurations that give you optimum availability as well as scalability. Once you’ve learned how these elements work, you’ll find real-world solutions to help you improve indexing performance, as well as tips and guidance on safety so you can back up and restore data. Once you’ve learned each component outlined throughout, you will be confident that you can help to deliver an improved search experience – exactly what modern users demand and expect.
Elasticsearch 5.x Cookbook. Distributed Search and Analytics - Third Edition
Alberto Paro
Elasticsearch is a Lucene-based distributed search server that allows users to index and search unstructured content with petabytes of data. This book is your one-stop guide to master the complete Elasticsearch ecosystem. We’ll guide you through comprehensive recipes on what’s new in Elasticsearch 5.x, showing you how to create complex queries and analytics, and perform index mapping, aggregation, and scripting. Further on, you will explore the modules of Cluster and Node monitoring and see ways to back up and restore a snapshot of an index. You will understand how to install Kibana to monitor a cluster and also to extend Kibana for plugins. Finally, you will also see how you can integrate your Java, Scala, Python, and Big Data applications such as Apache Spark and Pig with Elasticsearch, and add enhanced functionalities with custom plugins.By the end of this book, you will have an in-depth knowledge of the implementation of the Elasticsearch architecture and will be able to manage data efficiently and effectively with Elasticsearch.
Anurag Srivastava, Douglas Miller
Elasticsearch is one of the most popular tools for distributed search and analytics. This Elasticsearch book highlights the latest features of Elasticsearch 7 and helps you understand how you can use them to build your own search applications with ease.Starting with an introduction to the Elastic Stack, this book will help you quickly get up to speed with using Elasticsearch. You'll learn how to install, configure, manage, secure, and deploy Elasticsearch clusters, as well as how to use your deployment to develop powerful search and analytics solutions. As you progress, you'll also understand how to troubleshoot any issues that you may encounter along the way. Finally, the book will help you explore the inner workings of Elasticsearch and gain insights into queries, analyzers, mappings, and aggregations as you learn to work with search results.By the end of this book, you'll have a basic understanding of how to build and deploy effective search and analytics solutions using Elasticsearch.
Bharvi Dixit
With constantly evolving and growing datasets, organizations have the need to find actionable insights for their business. ElasticSearch, which is the world's most advanced search and analytics engine, brings the ability to make massive amounts of data usable in a matter of milliseconds. It not only gives you the power to build blazing fast search solutions over a massive amount of data, but can also serve as a NoSQL data store.This guide will take you on a tour to become a competent developer quickly with a solid knowledge level and understanding of the ElasticSearch core concepts. Starting from the beginning, this book will cover these core concepts, setting up ElasticSearch and various plugins, working with analyzers, and creating mappings. This book provides complete coverage of working with ElasticSearch using Python and performing CRUD operations and aggregation-based analytics, handling document relationships in the NoSQL world, working with geospatial data, and taking data backups. Finally, we’ll show you how to set up and scale ElasticSearch clusters in production environments as well as providing some best practices.
Marek Rogozinski, Rafal Kuc
ElasticSearch is a very fast and scalable open source search engine, designed with distribution and cloud in mind, complete with all the goodies that Apache Lucene has to offer. ElasticSearch’s schema-free architecture allows developers to index and search unstructured content, making it perfectly suited for both small projects and large big data warehouses, even those with petabytes of unstructured data.This book will guide you through the world of the most commonly used ElasticSearch server functionalities. You’ll start off by getting an understanding of the basics of ElasticSearch and its data indexing functionality. Next, you will see the querying capabilities of ElasticSearch, followed by a through explanation of scoring and search relevance. After this, you will explore the aggregation and data analysis capabilities of ElasticSearch and will learn how cluster administration and scaling can be used to boost your application performance. You’ll find out how to use the friendly REST APIs and how to tune ElasticSearch to make the most of it. By the end of this book, you will have be able to create amazing search solutions as per your project’s specifications.
Nicolae Tarla
Power Virtual Agents is a set of technologies released under the Power Platform umbrella by Microsoft. It allows non-developers to create solutions to automate customer interactions and provide services using a conversational interface, thus relieving the pressure on front-line staff providing this kind of support.Empowering Organizations with Power Virtual Agents is a guide to building chatbots that can be deployed to handle front desk services without having to write code. The book takes a scenario-based approach to implementing bot services and automation to serve employees in the organization and external customers. You will uncover the features available in Power Virtual Agents for creating bots that can be integrated into an organization’s public site as well as specific web pages. Next, you will understand how to build bots and integrate them within the Teams environment for internal users. As you progress, you will explore complete examples for implementing automated agents (bots) that can be deployed on sites for interacting with external customers.By the end of this Power Virtual Agents chatbot book, you will have implemented several scenarios to serve external client requests for information, created scenarios to help internal users retrieve relevant information, and processed these in an automated conversational manner.
Matt Benatan, Jochem Gietema, Marian Schneider
Deep learning has an increasingly significant impact on our lives, from suggesting content to playing a key role in mission- and safety-critical applications. As the influence of these algorithms grows, so does the concern for the safety and robustness of the systems which rely on them. Simply put, typical deep learning methods do not know when they don’t know.The field of Bayesian Deep Learning contains a range of methods for approximate Bayesian inference with deep networks. These methods help to improve the robustness of deep learning systems as they tell us how confident they are in their predictions, allowing us to take more in how we incorporate model predictions within our applications.Through this book, you will be introduced to the rapidly growing field of uncertainty-aware deep learning, developing an understanding of the importance of uncertainty estimation in robust machine learning systems. You will learn about a variety of popular Bayesian Deep Learning methods, and how to implement these through practical Python examples covering a range of application scenarios.By the end of the book, you will have a good understanding of Bayesian Deep Learning and its advantages, and you will be able to develop Bayesian Deep Learning models for safer, more robust deep learning systems.
Ryan Doan
The rapid advancements in large language models (LLMs) bring significant challenges in deployment, maintenance, and scalability. This Essential Guide to LLMOps provides practical solutions and strategies to overcome these challenges, ensuring seamless integration and the optimization of LLMs in real-world applications.This book takes you through the historical background, core concepts, and essential tools for data analysis, model development, deployment, maintenance, and governance. You’ll learn how to streamline workflows, enhance efficiency in LLMOps processes, employ LLMOps tools for precise model fine-tuning, and address the critical aspects of model review and governance. You’ll also get to grips with the practices and performance considerations that are necessary for the responsible development and deployment of LLMs. The book equips you with insights into model inference, scalability, and continuous improvement, and shows you how to implement these in real-world applications.By the end of this book, you’ll have learned the nuances of LLMOps, including effective deployment strategies, scalability solutions, and continuous improvement techniques, equipping you to stay ahead in the dynamic world of AI.
Sreeram Nudurupati
Apache Spark is a unified data analytics engine designed to process huge volumes of data quickly and efficiently. PySpark is Apache Spark's Python language API, which offers Python developers an easy-to-use scalable data analytics framework.Essential PySpark for Scalable Data Analytics starts by exploring the distributed computing paradigm and provides a high-level overview of Apache Spark. You'll begin your analytics journey with the data engineering process, learning how to perform data ingestion, cleansing, and integration at scale. This book helps you build real-time analytics pipelines that help you gain insights faster. You'll then discover methods for building cloud-based data lakes, and explore Delta Lake, which brings reliability to data lakes. The book also covers Data Lakehouse, an emerging paradigm, which combines the structure and performance of a data warehouse with the scalability of cloud-based data lakes. Later, you'll perform scalable data science and machine learning tasks using PySpark, such as data preparation, feature engineering, and model training and productionization. Finally, you'll learn ways to scale out standard Python ML libraries along with a new pandas API on top of PySpark called Koalas.By the end of this PySpark book, you'll be able to harness the power of PySpark to solve business problems.
Rongpeng Li
Statistics remain the backbone of modern analysis tasks, helping you to interpret the results produced by data science pipelines. This book is a detailed guide covering the math and various statistical methods required for undertaking data science tasks.The book starts by showing you how to preprocess data and inspect distributions and correlations from a statistical perspective. You’ll then get to grips with the fundamentals of statistical analysis and apply its concepts to real-world datasets. As you advance, you’ll find out how statistical concepts emerge from different stages of data science pipelines, understand the summary of datasets in the language of statistics, and use it to build a solid foundation for robust data products such as explanatory models and predictive models. Once you’ve uncovered the working mechanism of data science algorithms, you’ll cover essential concepts for efficient data collection, cleaning, mining, visualization, and analysis. Finally, you’ll implement statistical methods in key machine learning tasks such as classification, regression, tree-based methods, and ensemble learning.By the end of this Essential Statistics for Non-STEM Data Analysts book, you’ll have learned how to build and present a self-contained, statistics-backed data product to meet your business goals.
Ethereum Projects for Beginners. Build blockchain-based cryptocurrencies, smart contracts, and DApps
Kenny Vaneetvelde
Ethereum enables the development of efficient, smart contracts that contain code. These smart contracts can interact with other smart contracts to make decisions, store data, and send Ether to others.Ethereum Projects for Beginners provides you with a clear introduction to creating cryptocurrencies, smart contracts, and decentralized applications. As you make your way through the book, you’ll get to grips with detailed step-by-step processes to build advanced Ethereum projects. Each project will teach you enough about Ethereum to be productive right away. You will learn how tokenization works, think in a decentralized way, and build blockchain-based distributed computing systems. Towards the end of the book, you will develop interesting Ethereum projects such as creating wallets and secure data sharing.By the end of this book, you will be able to tackle blockchain challenges by implementing end-to-end projects using the full power of the Ethereum blockchain.
Mayukh Mukhopadhyay
Ethereum is a public, blockchain-based distributed computing platform featuring smart contract functionality. This book is your one-stop guide to blockchain and Ethereum smart contract development. We start by introducing you to the basics of blockchain. You'll learn about hash functions, Merkle trees, forking, mining, and much more. Then you'll learn about Ethereum and smart contracts, and we'll cover Ethereum virtual machine (EVM) in detail. Next, you'll get acquainted with DApps and DAOs and see how they work. We'll also delve into the mechanisms of advanced smart contracts, taking a practical approach.You'll also learn how to develop your own cryptocurrency from scratch in order to understand the business behind ICO. Further on, you'll get to know the key concepts of the Solidity programming language, enabling you to build decentralized blockchain-based applications. We'll also look at enterprise use cases, where you'll build a decentralized microblogging site. At the end of this book, we discuss blockchain-as-a-service, the dark web marketplace, and various advanced topics so you can get well versed with the blockchain principles and ecosystem.
EU General Data Protection Regulation (GDPR). An implementation and compliance guide
IT Governance Publishing, IT Governance Privacy Team
This book provides a thorough exploration of the EU General Data Protection Regulation (GDPR). It starts with the core principles of GDPR, explaining its purpose, key concepts, and how it impacts data controllers and processors. The book covers essential features like data subject rights, data processing principles, and privacy compliance frameworks. It also explores the role of the Data Protection Officer (DPO) and the importance of conducting data protection impact assessments (DPIAs).Focusing on practical implementation, the book highlights the need for robust information security measures to meet GDPR standards. It provides actionable advice on best practices, including managing data breaches, ensuring lawful consent, and processing subject access requests. The guide also addresses the complexities of international data transfers in line with GDPR requirements.Finally, the book outlines GDPR enforcement mechanisms, detailing the powers of supervisory authorities and the steps to demonstrate compliance. This resource offers organizations a comprehensive roadmap to align with GDPR, laying the groundwork for effective data protection and compliance.