CONFERENCE FOR

Data engineers

Data scientists

Data architects

Data and business analysts

Software developers

Anyone interested in learning more about data

Welcome to the Big Data Fest 2023 — the premier online event for the big data community.

Attendees will have the opportunity to learn about new big data tools and technologies from industry leaders and practitioners. All sessions will be conducted online and accessible to participants worldwide.

This annual event will provide a platform for the big data community to come together, network, share knowledge and insights, and stay current on the latest advancements in the field.

The event is free, but you can donate to the Open Eyes Charity Fund to buy medical equipment for hospitals in Ukraine.

Join us and be part of shaping the future of data engineering.
See you at Big Data Fest 2023!

Why attend?

Stay current on the latest big data and data engineering trends

Gain valuable insights from industry leaders

Expand your professional network

Learn about new tools and technologies

Be a part of shaping the future of data engineering

AGENDA

Day 1

TIME ZONE (GMT+3)

16:00

Intro

16:10

Taras Bachynskyi, AVP of Technology@SoftServe


Industrial Metaverse. The Evolution of Digital Twins

Keep up with rising trends in the software world! Recent technology advancements have created new enabling capabilities for Digital Twins and the Industrial Metaverse, a new frontier for large companies in the industrial sector. Join this inspirational session about Digital Twins and the Industrial Metaverse, where you will learn more about the concept and the technologies behind it.

16:45

James Serra, Big Data/Data Warehouse Evangelist@Microsoft


Data Lakehouse, Data Mesh, and Data Fabric
So many buzzwords of late: Data Lakehouse, Data Mesh, and Data Fabric. What do all these terms mean and how do they compare to a data warehouse? In this session I'll cover all of them in detail and compare them all. I'll include use cases so you can see what approach will work best for your company.

17:30

Break

17:35

Rodion Myronov, Director, Big Data & Analytics@SoftServe
Joe Reis, CEO@Ternary Data
Emilie Lundblad, Managing director@Amesto Nextbridge
Thiago de Faria, Hands-on AI/ML Cloud Architect@TilinTheCloud

Panel Discussion "Future of Big Data and Data Engineering"

18:20

Break

18:25

Jennifer Stirrup, Founder and CEO@Data Relish


The Metaverse, Big Data and Analytics: How can decision-makers harness the Metaverse?
The Metaverse is popularized in science fiction, and now it is becoming closer to being a part of our daily lives through the use of social media and shopping companies. How can businesses survive in a world where Artificial Intelligence is becoming the present and the future of technology, and how does the Metaverse fit into business strategy when futurist ideas are developing into reality at accelerated rates? How do we do this when our Business Intelligence isn't up to scratch? How can we move towards success with our data so we are set up for the Metaverse when it arrives? How can you help your company to evolve, adapt and succeed using Artificial Intelligence and the Metaverse to stay at the forefront of the competition? What potential issues, complications, and benefits could these technologies bring to our organizations and us? In this session, we will investigate how to start thinking about these technologies as an organization.

19:10

Ran Tibi, Tech Lead & Data consulting


Keep Your Data Encrypted in BigQuery
If you work with data and build a data warehouse, you probably have some sensitive data that you want to keep secure. While BigQuery encrypts all data before it is written to disk, anyone with read access to the tables has full visibility of the data, including sensitive data and PII. In this talk, we will describe a use case that shows how to build a secure end-to-end process that encrypts the data at the application level before inserting it into BigQuery and lets users decrypt it only at query time, without needing to know the actual encryption key. We will discuss important principles of data pipeline protection and see which out-of-the-box tools BigQuery provides for such tasks. Even if you don't use BigQuery as your data warehouse solution, you may still benefit from this talk, as you might get ideas on how to implement these principles on your own data platform.

19:55

Ralph Richards, Solutions Architect@AWS


Low Code, No Servers Data Engineering on AWS
With data becoming ubiquitous, more and more personas want to analyze, process, and monetize that data. The trouble is that data preparation is tedious, and Data Engineers are unicorns with hard-to-find skills. This session covers the importance of enabling self-service and ease of use for all personas. We will explore how AWS native services provide that ease of use and self-service for data engineering and BI.

Day 2

TIME ZONE (GMT+3)

16:00

Intro

16:05

Stewart Bryson, Head of Customer Experience@Coalesce.io


Does Modeling Still Matter in the Data Cloud?
It’s easier than ever to build analytics products and a data-driven culture. With cloud-native databases and modern data engineering platforms, we have the performance and scale to build limitless solutions without heavy lifting. This raises the question: should we use the same modeling techniques that emerged from data warehouses constrained by compute and storage? In this session, I’ll contend that traditional modeling strategies evolved in response to technical limitations that don’t exist today in the data cloud. I’ll identify a few key architectural concepts that should exist in your modern data stack, regardless of whether you’re building a data lake, warehouse, or lakehouse.

16:50

Timothy Spann, Principal Developer Advocate@Cloudera


Building Modern Data Streaming Apps
In my session, I will show you some best practices I have discovered over the last 7 years in building data streaming applications including IoT, CDC, Logs, and more. In my modern approach, we utilize several open-source frameworks to maximize the best features of all. We often start with Apache NiFi as the orchestrator of streams flowing into Apache Pulsar and/or Apache Kafka. From there we build streaming ETL with Apache Spark and enhance events with serverless functions for ML and enrichment. We build continuous queries against our topics with Flink SQL. We will stream data into Iceberg and other data stores.

17:30

Break

17:35

Barkha Herman, Technologist, Podcaster, WiT Advocate and mentor@StarTree

Realtime Analytics Unleashed: Unlocking the Potential of Data at Scale
In this talk, we will explore the exciting possibilities of modern analytics solutions, going beyond traditional business dashboards and diving into how to provide OLAP at OLTP speeds to a larger audience. Learn about real-time analytics and how scale can be achieved in this rapidly evolving field. Tech covered: Apache Pinot, star-tree indexes, and Kubernetes (k8s), running on Azure.

18:20

André Melancia, Developer / DBA / Microsoft Certified Trainer (MCT)

Closing in on OpenAI
Have data? Don't know what to do with it? You can process it using OpenAI. You have probably already tried ChatGPT, but there is much more to OpenAI. In this demo-driven session we'll see a few practical examples of how you can use it. Disclaimer: Be nice to bots, and they'll treat you better when they take over the world.

19:05

Break

19:10

Denny Cherry, Owner and Principal Consultant@Denny Cherry & Associates Consulting


Choosing the Right Data Store: An Overview of Cloud Data Platform Choices
Choosing the right data store can make or break the performance and costs of your application. In this session, you will learn about all the options available in the Data Platform space. You'll learn about CosmosDB, Azure SQL Database, Postgres, MySQL services, Databricks, Redshift, RDS, etc. You will also learn how to leverage caching to reduce the overall workload on your data tier. We will discuss the notion of polyglot persistence, storing different data in different data stores with the goal of optimizing your overall data architecture for flexibility and performance.

19:55

Jean Joseph, Technical Trainer/Data Engineer@Microsoft


Advanced Data Pipeline Architecture: Key Design Principles & Considerations
Data pipelines are meant to transfer data from disparate or legacy sources to a target system. Easy, right? Well, I am not sure about that. As Data Engineers, we are responsible for many data pipeline architecture decisions during the design phase. Join my session to learn all about data pipeline architecture: building blocks, diagrams, and patterns, as well as best practices and which technologies you should use.

Speakers

Rodion Myronov

Director, Big Data & Analytics, SoftServe

James Serra

Big Data/Data Warehouse Evangelist, Microsoft

Jennifer Stirrup

Founder and CEO, Data Relish

Taras Bachynskyi

AVP of Technology, SoftServe

Joe Reis

CEO, Ternary Data

André Melancia

Developer / DBA / Microsoft Certified Trainer (MCT)

Barkha Herman

Technologist, Podcaster, WiT Advocate and mentor, StarTree

Jean Joseph

Technical Trainer/Data Engineer, Microsoft

Timothy Spann

Principal Developer Advocate, Cloudera

Ralph Richards

Solutions Architect, AWS

Stewart Bryson

Head of Customer Experience, Coalesce.io

Ran Tibi

Tech Lead & Data consulting

Denny Cherry

Owner and Principal Consultant, Denny Cherry & Associates Consulting

Emilie Lundblad

Managing director, Amesto Nextbridge

Registration

Program committee

Taras Kloba

Big Data Competence Manager, SoftServe

Anu Venkataraman

Data Lake Engineer, Google

Anil Sener

Senior Data & AI Cloud Solutions Architect, Microsoft

John Mousa

Sr. Solutions Architect, AWS