Databases

481

Ebook

R: Recipes for Analysis, Visualization and Machine Learning

Shanthi Viswanathan, Atmajitsinh Gohil, Viswa Viswanathan, Yu-Wei, ...

The R language is a powerful, open source, functional programming language. At its core, R is a statistical programming language that provides impressive tools to analyze data and create high-level graphics. This Learning Path is chock-full of recipes. Literally! It aims to excite you with awesome projects focused on analysis, visualization, and machine learning. We’ll start off with data analysis – this will show you ways to use R to generate professional analysis reports. We’ll then move on to visualizing our data – this provides you with all the guidance needed to get comfortable with data visualization with R. Finally, we’ll move into the world of machine learning – this introduces you to data classification, regression, clustering, association rule mining, and dimension reduction. This Learning Path combines some of the best that Packt has to offer in one complete, curated package. It includes content from the following Packt products:
• R Data Analysis Cookbook by Viswa Viswanathan and Shanthi Viswanathan
• R Data Visualization Cookbook by Atmajitsinh Gohil
• Machine Learning with R Cookbook by Yu-Wei, Chiu (David Chiu)

482

Ebook

R Web Scraping Quick Start Guide. Techniques and tools to crawl and scrape data from websites

Olgun Aydin

Web scraping is a technique to extract data from websites. It simulates the behavior of a website user to turn the website itself into a web service to retrieve or introduce new data. This book gives you all you need to get started with scraping web pages using R programming. You will learn about the rules of RegEx and XPath, key components for scraping website data. We will show you web scraping techniques, methodologies, and frameworks. With this book's guidance, you will become comfortable with the tools to write and test RegEx and XPath rules. We will focus on examples of dynamic websites for scraping data and how to implement the techniques learned. You will learn how to collect URLs and then create XPath rules for your first web scraping script using the rvest library. From the data you collect, you will be able to calculate statistics and create R plots to visualize them. Finally, you will discover how to use Selenium drivers with R for more sophisticated scraping. You will create AWS instances and use R to connect to a PostgreSQL database hosted on AWS. By the end of the book, you will be sufficiently confident to create end-to-end web scraping systems using R.

483

Ebook

Raportowanie w System Center Configuration Manager Bez tajemnic

Garth Jones, Dan Toll, Kerrie Meyler

The SQL Server database behind Microsoft System Center Configuration Manager (ConfigMgr) holds a wealth of valuable information about your users, computers, hardware, operating systems, applications, and compliance state. To help you extract this data efficiently, Microsoft provides several excellent tools, including SQL Server Reporting Services (SSRS) and the SQL Server Data Tools Business Intelligence (SSDT-BI) add-on. Raportowanie w System Center Configuration Manager bez tajemnic shows you how to get the most out of these tools. World-renowned reporting guru Garth Jones and his expert coauthors guide you through every aspect of custom reporting in System Center. Starting with the installation and configuration of SSRS, you will learn step by step how to use SQL views to find the data you need, build SQL queries, create simple and advanced reports, and use role-based administration to deliver those reports securely to the right people. In this book, Jones has gathered current, reliable, and comprehensive System Center reporting techniques that you will not find in other books or on websites. With this guide, you will be able to consistently obtain the right information to solve pressing problems and respond quickly to management's concerns. Garth Jones, chief architect at Enhansoft and a Microsoft MVP, specializes in extending the value and reach of System Center Configuration Manager. He has worked with the System Center product family since 1996, when it was still called SMS. Dan Toll is a Configuration Manager administrator who has worked with the product since SMS 2003. He specializes in operating system deployment for workstations and servers using the Microsoft Deployment Toolkit (MDT) and in ConfigMgr reporting. Kerrie Meyler, a Microsoft MVP, is the lead author of many books in the System Center Unleashed series. She currently works as an independent consultant. Over a career spanning more than 17 years, she evangelized SMS as a senior technology specialist at Microsoft and presented System Center technologies at the TechEd and MMS conferences.
Detailed information on:
• Installing and configuring SSRS for optimal System Center reporting and easier troubleshooting
• The data stored in the ConfigMgr site database
• Retrieving ConfigMgr data efficiently by writing SQL queries in SQL Server Management Studio
• Best practices for creating and designing reports in System Center
• Creating report templates, customizing content with report parameters, and embedding charts
• Customizing logos, color palettes, and other report elements for a specific organization
• Building advanced drill-through methods to provide additional information
• Strengthening report security by integrating ConfigMgr role-based administration into SQL queries
• Using reporting to measure key performance indicators and deepen your knowledge of your own environment
• Tailoring reports to the needs of end users or management
ON THE WEB: All examples and scripts presented in this book are available for download at informit.com/title/9780672337789

484

Ebook

Raspberry Pi Server Essentials. If you want to use Raspberry Pi as a server, this is the book that makes it all possible. Covering a wide range of projects – from network storage to a game server – you’ll learn in easy, engaging steps

Piotr J Kula

485

Ebook

RavenDB 2.x Beginner's Guide. For .NET developers who want to acquire document-oriented database skills, there is no better introduction to RavenDB than this book. It covers all the bases in a user-friendly style that makes learning fast and easy

Khaled Tannir

RavenDB is a second generation document database written in .NET, offering a flexible data model designed to address requirements coming from real-world systems. It is different from the other document databases around, as with RavenDB you can get up and running in a few minutes, and that includes grasping all the basics. It allows you to build high-performance, low-latency applications with ease and efficiency. RavenDB 2.x Beginner's Guide introduces RavenDB concepts and teaches you everything, right from installing RavenDB, to creating documents, and querying indexes. This book will help you take advantage of powerful, document-oriented NoSQL databases and build a solid foundation on which you can create your .NET applications. This book presents RavenDB, the .NET document-oriented NoSQL database, through a series of clear and practical exercises that will help you to take advantage of this database server. The book starts off with an introduction to RavenDB and its Management Studio. You will then move ahead and learn how to quickly and efficiently build high performance, NoSQL document-oriented .NET applications using the .NET client API or the HTTP REST API. Next, dynamic and static indexes that use map/reduce to process datasets are covered. You will then see how to create and query these indexes, with the help of detailed examples. You will also learn how to deploy your RavenDB server in a production environment and how to optimize and secure it. With numerous practical examples, RavenDB 2.x Beginner's Guide teaches you everything you need to know for building high performance .NET document-oriented NoSQL databases.

486

Ebook

Real-Time Big Data Analytics. Design, process, and analyze large sets of complex data in real time

Sumit Gupta, Shilpi Saxena

Enterprises have been striving hard to deal with the challenges of data arriving in real time or near real time. Although there are technologies such as Storm and Spark (and many more) that solve the challenges of real-time data, using the appropriate technology/framework for the right business use case is the key to success. This book provides you with the skills required to quickly design, implement and deploy your real-time analytics using real-world examples of big data use cases. From the beginning of the book, we will cover the basics of varied real-time data processing frameworks and technologies. We will discuss and explain the differences between batch and real-time processing in detail, and will also explore the techniques and programming concepts using Apache Storm. Moving on, we’ll familiarize you with “Amazon Kinesis” for real-time data processing in the cloud. We will further develop your understanding of real-time analytics through a comprehensive review of Apache Spark along with the high-level architecture and the building blocks of a Spark program. You will learn how to transform your data, get an output from transformations, and persist your results using Spark RDDs, using an interface called Spark SQL to work with Spark. At the end of this book, we will introduce Spark Streaming, the streaming library of Spark, and will walk you through the emerging Lambda Architecture (LA), which provides a hybrid platform for big data processing by combining real-time and precomputed batch data to provide a near real-time view of incoming data.

487

Ebook

Real-World Implementation of C# Design Patterns. Overcome daily programming challenges using elements of reusable object-oriented software

Bruce M. Van Horn II, Van Symons

As a software developer, you need to learn new languages and simultaneously get familiarized with the programming paradigms and methods of leveraging patterns, as both a communications tool and an advantage when designing well-written, easy-to-maintain code. Design patterns, being a collection of best practices, provide the necessary wisdom to help you overcome common sets of challenges in object-oriented design and programming. This practical guide to design patterns helps C# developers put their programming knowledge to work. The book takes a hands-on approach to introducing patterns and anti-patterns, elaborating on 14 patterns along with their real-world implementations. Throughout the book, you'll understand the implementation of each pattern, as well as find out how to successfully implement those patterns in C# code within the context of a real-world project. By the end of this design patterns book, you’ll be able to recognize situations that tempt you to reinvent the wheel, and quickly avoid the time and cost associated with solving common and well-understood problems with battle-tested design patterns.

488

Ebook

Redis 4.x Cookbook. Over 80 hand-picked recipes for effective Redis development and administration

Pengcheng Huang, Zuofei Wang

Redis is considered the world's most popular key-value store database. Its versatility and the wide variety of use cases it enables have made it a popular choice of database for many enterprises. Based on the latest version of Redis, this book provides both step-by-step recipes and the relevant background information required to utilize its features to the fullest. It covers everything from a basic understanding of Redis data types to advanced aspects of Redis high availability, clustering, administration, and troubleshooting. This book will be your great companion to master all aspects of Redis. The book starts off by installing and configuring Redis for you to get started with ease. Moving on, all the data types and features of Redis are introduced in detail. Next, you will learn how to develop applications with Redis in Java, Python, and the Spring Boot web framework. You will also learn replication tasks, which will help you to troubleshoot replication issues. Furthermore, you will learn the steps that need to be undertaken to ensure high availability on your cluster and during production deployment. Toward the end of the book, you will learn the topmost tasks that will help you to troubleshoot your ecosystem efficiently, along with extending Redis by using different modules.

489

Ebook

Redis Stack for Application Modernization. Build real-time multi-model applications at any scale with Redis

Luigi Fugaro, Mirko Ortensi

In modern applications, efficiency in both operational and analytical aspects is paramount, demanding predictable performance across varied workloads. This book introduces you to Redis Stack, an extension of Redis, and guides you through its broad data modeling capabilities. With practical examples of real-time queries and searches, you’ll explore Redis Stack’s new approach to providing a rich data modeling experience all within the same database server. You’ll learn how to model and search your data in the JSON and hash data types and work with features such as vector similarity search, which adds semantic search capabilities to your applications to search for similar texts, images, or audio files. The book also shows you how to use the probabilistic Bloom filters to efficiently resolve recurrent big data problems. As you uncover the strengths of Redis Stack as a data platform, you’ll explore use cases for managing database events and leveraging stream processing features. Finally, you’ll see how Redis Stack seamlessly integrates into microservices architectures, completing the picture. By the end of this book, you’ll be equipped with best practices for administering and managing the server, ensuring scalability, high availability, data integrity, stored functions, and more.

490

Ebook

Reporting with Microsoft SQL Server 2012. Learn to quickly create reports in SSRS and Power View as well as understand the best use of each reporting tool

James Serra, William Anton

491

Ebook

Responsible AI in the Enterprise. Practical AI risk management for explainable, auditable, and safe models with hyperscalers and Azure OpenAI

Adnan Masood, Heather Dawe, Ed Price, Dr. Ehsan Adeli

Responsible AI in the Enterprise is a comprehensive guide to implementing ethical, transparent, and compliant AI systems in an organization. With a focus on understanding key concepts of machine learning models, this book equips you with techniques and algorithms to tackle complex issues such as bias, fairness, and model governance. Throughout the book, you’ll gain an understanding of FairLearn and InterpretML, along with Google What-If Tool, ML Fairness Gym, the IBM AI Fairness 360 tool, and Aequitas. You’ll uncover various aspects of responsible AI, including model interpretability, monitoring and management of model drift, and compliance recommendations. You’ll gain practical insights into using AI governance tools to ensure fairness, bias mitigation, explainability, privacy compliance, and privacy in an enterprise setting. Additionally, you’ll explore interpretability toolkits and fairness measures offered by major cloud AI providers like IBM, Amazon, Google, and Microsoft, while discovering how to use FairLearn for fairness assessment and bias mitigation. You’ll also learn to build explainable models using global and local feature summary, local surrogate model, Shapley values, anchors, and counterfactual explanations. By the end of this book, you’ll be well-equipped with tools and techniques to create transparent and accountable machine learning models.

492

Ebook

Ripple Quick Start Guide. Get started with XRP and develop applications on Ripple's blockchain

Febin John James

This book starts by giving you an understanding of the basics of blockchain and the Ripple protocol. You will then get some hands-on experience of working with XRP. You will learn how to set up a Ripple wallet and see how seamlessly you can transfer money abroad. You will learn about different types of wallets through which you can store and transact XRP, along with the security precautions you need to take to keep your money safe. Since Ripple is currency agnostic, it can enable the transfer of value in USD, EUR, and any other currency. You can even transfer digital assets using Ripple. You will see how you can pay an international merchant with their own native currency and how Ripple can exchange it on the fly. Once you understand the applications of Ripple, you will learn how to create a conditionally-held escrow using the Ripple API, and how to send and cash checks. Finally, you will also understand the common misconceptions people have about Ripple and discover the potential risks you must consider before making investment decisions. By the end of this book, you will have a solid foundation for working with Ripple's blockchain. Using it, you will be able to solve problems caused by traditional systems in your respective industry.

493

Ebook

Rola archiwów w procesie wdrażania systemów elektronicznego zarządzania dokumentacją. Z doświadczeń archiwów szkół wyższych, instytucji naukowych i kulturalnych oraz państwowych i samorządowych jednostek organizacyjnych

Eds. Antoni Barciak, Dorota Drzewiecka, Katarzyna Pepłowska

This book discusses the difficult process of implementing electronic document management (EZD) systems in the work of organizational units, in the context of the digitization of the state. The scholarly literature devotes more and more space to the design and implementation of electronic document management systems. Unfortunately, too little is said about the involvement of archivists, and one often gets the impression that they are overlooked in this important process. Yet the knowledge archivists hold on many document management matters would help avoid numerous problems that arise in practice. This book fills that gap by concentrating on the experiences of archives and their role in EZD implementation. It offers a broad view of the work of archives and is addressed to archivists, staff of organizational units, decision makers, and students. It gives a practical explanation of EZD implementation processes, which makes it a valuable source of knowledge for everyone currently grappling with EZD.

494

Ebook

Salesforce Data Architect Certification Guide. Comprehensive coverage of the Salesforce Data Architect exam content to help you pass on the first attempt

Aaron Allport

The Salesforce Data Architect is a prerequisite exam for the Application Architect half of the Salesforce Certified Technical Architect credential. This book offers complete, up-to-date coverage of the Salesforce Data Architect exam so you can take it with confidence. The book is written in a clear, succinct way with self-assessment and practice exam questions, covering all the topics necessary to help you pass the exam with ease. You’ll understand the theory around Salesforce data modeling, database design, master data management (MDM), Salesforce data management (SDM), and data governance. Additionally, performance considerations associated with large data volumes will be covered. You’ll also get to grips with data migration and understand the supporting theory needed to achieve Salesforce Data Architect certification. By the end of this Salesforce book, you'll have covered everything you need to know to pass the Salesforce Data Architect certification exam and have a handy, on-the-job desktop reference guide to re-visit the concepts.

495

Ebook

Salesforce Platform App Builder Certification Handbook. A handy guide that covers the most essential topics for Salesforce Platform App Builder Certification in an easy-to-understand format

Siddhesh Kabe

The Salesforce Certified Platform App Builder exam is for individuals who want to demonstrate their skills and knowledge in designing, building, and implementing custom applications using the declarative customization capabilities of Force.com. This book will build a strong foundation in Force.com to prepare you for the Platform App Builder certification exam. It will guide you through designing the interface while introducing the Lightning Process Builder. Next, we will implement business logic using various point-and-click features of Force.com. We will learn to manage data and create reports and dashboards. We will then learn to administer the Force.com application by configuring the object-level, field-level, and record-level security. By the end of this book, you will be completely equipped to take the Platform App Builder certification exam.

496

Ebook

SAP Data Services 4.x Cookbook. Delve into the SAP Data Services environment to efficiently prepare, implement, and develop ETL processes

Ivan Shomnikov, Stanislav Pereyaslov

Want to cost-effectively deliver trusted information to all of your crucial business functions? SAP Data Services delivers one enterprise-class solution for data integration, data quality, data profiling, and text data processing. It boosts productivity with a single solution for data quality and data integration. SAP Data Services also enables you to move, improve, govern, and unlock big data. This book will lead you through the SAP Data Services environment to efficiently develop ETL processes. To begin with, you’ll learn to install, configure, and prepare the ETL development environment. You will get familiarized with the concepts of developing ETL processes with SAP Data Services. Starting from the smallest unit of work, the data flow, the chapters will lead you to the highest organizational unit, the Data Services job, revealing the advanced techniques of ETL design. You will learn to import XML files by creating and implementing real-time jobs. It will then guide you through the ETL development patterns that enable the most effective performance when extracting, transforming, and loading data. You will also find out how to create validation functions and transforms. Finally, the book will show you the benefits of data quality management with the help of another SAP solution, Information Steward.

497

Ebook

SAP NetWeaver MDM 7.1 Administrator's Guide

Uday Rao

498

Ebook

Scala: Applied Machine Learning. Master the art of Machine Learning in Scala

Patrick R. Nicolas, Alex Kozlov, Pascal Bugnion

This Learning Path aims to put the entire world of machine learning with Scala in front of you. Scala for Data Science, the first module in this course, is a tutorial guide to some of the most common Scala libraries for data science, allowing you to quickly get up to speed building data science and data engineering solutions. The second module, Scala for Machine Learning, guides you through the process of building AI applications with diagrams, formal mathematical notation, source code snippets, and useful tips. A review of the Akka framework and Apache Spark clusters concludes the tutorial. The next module, Mastering Scala Machine Learning, is the final step in this course. It will take your knowledge to the next level and help you use it to build advanced applications such as social media mining, intelligent news portals, and more. After a quick refresher on functional programming concepts using REPL, you will see some practical examples of setting up the development environment and tinkering with data. We will then explore working with Spark and MLlib using k-means and decision trees. By the end of this course, you will be a master at Scala machine learning and have enough expertise to be able to build complex machine learning projects using Scala. This Learning Path combines some of the best that Packt has to offer in one complete, curated package. It includes content from the following Packt products:
• Scala for Data Science, Pascal Bugnion
• Scala for Machine Learning, Patrick Nicolas
• Mastering Scala Machine Learning, Alex Kozlov

499

Ebook

Scala for Machine Learning. Build systems for data processing, machine learning, and deep learning - Second Edition

Patrick R. Nicolas

The discovery of information through data clustering and classification is becoming a key differentiator for competitive organizations. Machine learning applications are everywhere, from self-driving cars, engineering design, logistics, manufacturing, and trading strategies, to detection of genetic anomalies. The book is your one-stop guide that introduces you to the functional capabilities of the Scala programming language that are critical to the creation of machine learning algorithms, such as dependency injection and implicits. You start by learning data preprocessing and filtering techniques. Following this, you'll move on to unsupervised learning techniques such as clustering and dimension reduction, followed by probabilistic graphical models such as Naïve Bayes, hidden Markov models and Monte Carlo inference. Further, it covers discriminative algorithms such as linear and logistic regression with regularization, kernelization, support vector machines, neural networks, and deep learning. You’ll move on to evolutionary computing, multibandit algorithms, and reinforcement learning. Finally, the book includes a comprehensive overview of parallel computing in Scala and Akka followed by a description of Apache Spark and its ML library. With updated code based on the latest version of Scala and comprehensive examples, this book will ensure that you have more than just a solid fundamental knowledge in machine learning with Scala.

500

Ebook

Scalable Data Architecture with Java. Build efficient enterprise-grade data architecting solutions using Java

Sinchan Banerjee

Java architectural patterns and tools help architects to build reliable, scalable, and secure data engineering solutions that collect, manipulate, and publish data. This book will help you make the most of the architecting data solutions available with clear and actionable advice from an expert. You’ll start with an overview of data architecture, exploring responsibilities of a Java data architect, and learning about various data formats, data storage, databases, and data application platforms as well as how to choose them. Next, you’ll understand how to architect a batch and real-time data processing pipeline. You’ll also get to grips with the various Java data processing patterns, before progressing to data security and governance. The later chapters will show you how to publish Data as a Service and how you can architect it. Finally, you’ll focus on how to evaluate and recommend an architecture by developing performance benchmarks, estimations, and various decision metrics. By the end of this book, you’ll be able to successfully orchestrate data architecture solutions using Java and related technologies as well as to evaluate and present the most suitable solution to your clients.

501

Ebook

Scientific Computing with Python 3

Claus Führer, Jan Erik Solem, Olivier Verdier

Python can be used for more than just general-purpose programming. It is a free, open source language and environment that has tremendous potential for use within the domain of scientific computing. This book presents Python in tight connection with mathematical applications and demonstrates how to use various concepts in Python for computing purposes, including examples with the latest version of Python 3. Python is an effective tool to use when coupling scientific computing and mathematics and this book will teach you how to use it for linear algebra, arrays, plotting, iterating, functions, polynomials, and much more.

502

Ebook

scikit-learn Cookbook. Over 80 recipes for machine learning in Python with scikit-learn - Second Edition

Julian Avila, Trent Hauck

Python is quickly becoming the go-to language for analysts and data scientists due to its simplicity and flexibility, and within the Python data space, scikit-learn is the unequivocal choice for machine learning. This book includes walkthroughs and solutions to the common as well as the not-so-common problems in machine learning, and how scikit-learn can be leveraged to perform various machine learning tasks effectively. The second edition begins by taking you through recipes on evaluating the statistical properties of data and generating synthetic data for machine learning modelling. As you progress through the chapters, you will come across recipes that will teach you to implement techniques like data pre-processing, linear regression, logistic regression, K-NN, Naïve Bayes, classification, decision trees, ensembles and much more. Furthermore, you’ll learn to optimize your models with multi-class classification, cross validation, and model evaluation, and dive deeper into implementing deep learning with scikit-learn. Along with covering the enhanced features on model section, API and new features like classifiers, regressors and estimators, the book also contains recipes on evaluating and fine-tuning the performance of your model. By the end of this book, you will have explored a plethora of features offered by scikit-learn for Python to solve any machine learning problem you come across.

503

Ebook

Securing Hadoop. Implement robust end-to-end security for your Hadoop ecosystem

Sudheesh Narayan

Security of Big Data is one of the biggest concerns for enterprises today. How do we protect the sensitive information in a Hadoop ecosystem? How can we integrate Hadoop security with existing enterprise security systems? What are the challenges in securing Hadoop and its ecosystem? These are the questions which need to be answered in order to ensure effective management of Big Data. Hadoop, along with Kerberos, provides security features which enable Big Data management and which keep data secure. This book is a practitioner's guide for securing a Hadoop-based Big Data platform. This book provides you with a step-by-step approach to implementing end-to-end security along with a solid foundation of knowledge of the Hadoop and Kerberos security models. This practical, hands-on guide looks at the security challenges involved in securing sensitive data in a Hadoop-based Big Data platform and also covers the Security Reference Architecture for securing Big Data. It will take you through the internals of the Hadoop and Kerberos security models and will provide detailed implementation steps for securing Hadoop. You will also learn how the internals of the Hadoop security model are implemented, how to integrate Enterprise Security Systems with Hadoop security, and how you can manage and control user access to a Hadoop ecosystem seamlessly. You will also get acquainted with implementing audit logging and security incident monitoring within a Big Data platform.

504

Ebook

Serverless ETL and Analytics with AWS Glue. Your comprehensive reference guide to learning about AWS Glue and its features

Vishal Pathak, Subramanya Vajiraya, Noritaka Sekiyama, Tomohiro Tanaka, ...

Organizations these days have gravitated toward services such as AWS Glue that undertake undifferentiated heavy lifting and provide serverless Spark, enabling you to create and manage data lakes in a serverless fashion. This guide shows you how AWS Glue can be used to solve real-world problems along with helping you learn about data processing, data integration, and building data lakes. Beginning with AWS Glue basics, this book teaches you how to perform various aspects of data analysis such as ad hoc queries, data visualization, and real-time analysis using this service. It also provides a walk-through of CI/CD for AWS Glue and how to shift left on quality using automated regression tests. You’ll find out how data security aspects such as access control, encryption, auditing, and networking are implemented, as well as getting to grips with useful techniques such as picking the right file format, compression, partitioning, and bucketing. As you advance, you’ll discover AWS Glue features such as crawlers, Lake Formation, governed tables, lineage, DataBrew, Glue Studio, and custom connectors. The concluding chapters help you to understand various performance tuning, troubleshooting, and monitoring options. By the end of this AWS book, you’ll be able to create, manage, troubleshoot, and deploy ETL pipelines using AWS Glue.
