24 – 25 May 2023 online conference
Big Data Fest
Navigating the Future of Data Engineering
CONFERENCE FOR
• Data engineers
• Data scientists
• Data architects
• Data and business analysts
• Software developers
• Anyone interested in learning more about data
Welcome to the Big Data Fest 2023 — the premier online event for the big data community.
Attendees will have the opportunity to learn about new big data tools and technologies from industry
leaders and practitioners. All sessions will be conducted online and accessible to participants worldwide.
This annual event will provide a platform for the big data community to come together, network, share knowledge and insights, and stay current on the latest advancements in the field.
The event is free, but you can donate to the Open Eyes Charity Fund to buy medical equipment for hospitals in Ukraine.
Join us and be part of shaping the future of data engineering.
See you at Big Data Fest 2023!
Why attend?
Stay current on the latest big data and data engineering trends
Gain valuable insights from industry leaders
Expand your professional network
Learn about new tools and technologies
Be a part of shaping the future of data engineering
AGENDA
Day 1
TIME ZONE (GMT+3)
16:00
Intro
16:10
Taras Bachynskyi, AVP of Technology@SoftServe
Industrial Metaverse. The Evolution of Digital Twins
Keep up with rising trends in the software world! Recent technology advances have created new enabling capabilities for Digital Twins and the Industrial Metaverse, a new frontier for large companies in the industrial sector. Join this inspirational session on Digital Twins and the Industrial Metaverse, where you will learn more about the concept and the technologies behind it.
16:45
James Serra, Big Data/Data Warehouse Evangelist@Microsoft
Data Lakehouse, Data Mesh, and Data Fabric
So many buzzwords of late: Data Lakehouse, Data Mesh, and Data Fabric. What do all these terms mean, and how do they compare to a data warehouse? In this session I'll cover each of them in detail and compare them, including use cases so you can see which approach will work best for your company.
17:30
Break
17:35
Rodion Myronov, Director, Big Data & Analytics@SoftServe
Joe Reis, CEO@Ternary Data
Emilie Lundblad, Managing Director@Amesto Nextbridge
Thiago de Faria, Hands-on AI/ML Cloud Architect@TilinTheCloud
Panel Discussion "Future of Big Data and Data Engineering"
18:20
Break
18:25
Jennifer Stirrup, Founder and CEO@Data Relish
The Metaverse, Big Data and Analytics: How can decision-makers harness the Metaverse?
The Metaverse was popularized in science fiction, and it is now edging closer to our daily lives through social media and retail companies. How can businesses survive in a world where Artificial Intelligence is becoming both the present and the future of technology, and how does the Metaverse fit into business strategy when futuristic ideas are turning into reality at an accelerated pace? How do we do this when our Business Intelligence isn't up to scratch? How can we set our data up for success so we are ready for the Metaverse when it arrives? How can you help your company evolve, adapt, and succeed, using Artificial Intelligence and the Metaverse to stay at the forefront of the competition? What potential issues, complications, and benefits could these technologies bring to our organizations and to us? In this session, we will investigate how to start thinking about these technologies as an organization.
19:10
Ran Tibi, Tech Lead & Data Consultant
Keep Your Data Encrypted in BigQuery
If you work with data and build a data warehouse, you probably have some sensitive data that you want to keep secure. While BigQuery encrypts all data before it is written to disk, once you have read access to the tables you have full visibility of the data, including sensitive data and PII. In this talk, we will walk through a use case showing how to build a secure end-to-end process that encrypts data at the application level before inserting it into BigQuery, and allows users to decrypt it only at query time without knowing the actual encryption key. We will discuss important principles of data pipeline protection and see which out-of-the-box tools BigQuery provides for such tasks. Even if you don't use BigQuery as your data warehouse solution, you may still benefit from this talk, as you might get ideas on how to implement these principles on your own data platform.
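For context, BigQuery's built-in tooling for this pattern is its AEAD SQL functions (e.g. AEAD.ENCRYPT and AEAD.DECRYPT_STRING over keysets created with KEYS.NEW_KEYSET). The Python sketch below is a language-agnostic toy illustration of the principle the talk describes, encrypting in the application before insert and decrypting only at query time; the cipher here is for illustration only and is not production cryptography.

```python
import hashlib
import secrets

# Toy illustration of application-level column encryption before insert.
# NOT real cryptography: in practice use BigQuery's AEAD functions or a
# vetted library such as Tink.

def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Derive a pseudo-random keystream from key + nonce (SHA-256 blocks)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]

def encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Encrypt client-side; only ciphertext is sent to the warehouse."""
    nonce = secrets.token_bytes(16)
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(p ^ s for p, s in zip(plaintext, stream))

def decrypt(key: bytes, ciphertext: bytes) -> bytes:
    """Query-time decryption, e.g. inside an authorized view or UDF."""
    nonce, body = ciphertext[:16], ciphertext[16:]
    stream = _keystream(key, nonce, len(body))
    return bytes(c ^ s for c, s in zip(body, stream))

key = secrets.token_bytes(32)  # held by the pipeline, never by table readers
row = {"user_id": 42, "email": encrypt(key, b"pii@example.com")}
# A plain SELECT on the table sees only ciphertext:
assert row["email"] != b"pii@example.com"
# Only a query path that holds the key can recover the value:
assert decrypt(key, row["email"]) == b"pii@example.com"
```

The point of the pattern is that readers with table access see only ciphertext; decryption happens at query time through a path that holds the key, so the key never needs to be shared with consumers.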
19:55
Ralph Richards, Solutions Architect@AWS
Low Code, No Servers Data Engineering on AWS
With data becoming ubiquitous, more and more personas want to analyze, process, and monetize that data. The trouble is that data preparation is tedious, and Data Engineers are unicorns with hard-to-find skills.
This session covers the importance of enabling self-service and ease of use for all personas. We will explore how AWS native services provide that ease of use and self-service for data engineering and BI.
Day 2
TIME ZONE (GMT+3)
16:00
Intro
16:05
Stewart Bryson, Head of Customer Experience@Coalesce.io
Does Modeling Still Matter in the Data Cloud?
It’s easier than ever to build analytics products and a data-driven culture. With cloud-native databases and modern data engineering platforms, we have the performance and scale to build limitless solutions without heavy lifting. This raises the question: should we use the same modeling techniques that emerged from data warehouses constrained by compute and storage? In this session, I’ll contend that traditional modeling strategies evolved in response to technical limitations that don’t exist today in the data cloud. I’ll identify a few key architectural concepts that should exist in your modern data stack, regardless of whether you’re building a data lake, warehouse, or lakehouse.
16:50
Timothy Spann, Principal Developer Advocate@Cloudera
Building Modern Data Streaming Apps
In my session, I will share best practices I have discovered over the last 7 years of building data streaming applications, including IoT, CDC, logs, and more. In my modern approach, we utilize several open-source frameworks, combining the best features of each. We often start with Apache NiFi as the orchestrator of streams flowing into Apache Pulsar and/or Apache Kafka. From there we build streaming ETL with Apache Spark and enhance events with serverless functions for ML and enrichment. We build continuous queries against our topics with Flink SQL, and we stream data into Iceberg and other data stores.
17:30
Break
17:35
Barkha Herman, Technologist, Podcaster, WiT Advocate and mentor@StarTree
Realtime Analytics Unleashed: Unlocking the Potential of Data at Scale
In this talk, we will explore the exciting possibilities of modern analytics solutions, going beyond traditional business dashboards and diving into how to provide OLAP at OLTP speeds to a larger audience. Learn about realtime analytics and how scale can be achieved in this rapidly evolving field. Tech covered: Apache Pinot, StarTree indexes, and Kubernetes, running on Azure.
18:20
André Melancia, Developer / DBA / Microsoft Certified Trainer (MCT)
Closing in on OpenAI
Have data? Don't know what to do with it? You can process it using OpenAI. You have probably already tried ChatGPT, but there is much more to OpenAI. In this demo-driven session we'll see a few practical examples of how you can use it. Disclaimer: be nice to bots, and they'll treat you better when they take over the world.
19:05
Break
19:10
Denny Cherry, Owner and Principal Consultant@Denny Cherry & Associates Consulting
Choosing the Right Data Store: An Overview of Cloud Data Platform Choices
Choosing the right data store can make or break the performance and costs of your application. In this session, you will learn about all the options available in the Data Platform space. You'll learn about CosmosDB, Azure SQL Database, Postgres, MySQL services, Databricks, Redshift, RDS, etc. You will also learn how to leverage caching to reduce the overall workload on your data tier. We will discuss the notion of polyglot persistence, storing different data in different data stores with the goal of optimizing your overall data architecture for flexibility and performance.
19:55
Jean Joseph, Technical Trainer/Data Engineer@Microsoft
Advanced Data Pipeline Architecture: Key Design Principles & Considerations
Data pipelines are meant to transfer data from disparate or legacy sources to a target system. Easy, right? Well, I am not so sure about that. As Data Engineers, it's our job to make many different data pipeline architecture decisions during the design phase. Join my session to learn all about data pipeline architecture: building blocks, diagrams, and patterns, plus best practices and which technologies you should use.
Speakers
Program committee
Taras Kloba
Big Data Competence Manager, SoftServe
Anu Venkataraman
Data Lake Engineer, Google
Anil Sener
Senior Data & AI Cloud Solutions Architect, Microsoft
John Mousa
Sr. Solutions Architect, AWS