Reflection on Databricks

Databricks creates Delta tables by default, which is beneficial because every table comes with a transaction log (history).
Databricks' COPY INTO ingestion technique is idempotent: even if you run the command or notebook cell multiple times, Databricks skips source files it has already loaded, which prevents duplicates from creeping into your table. It can also handle schema changes/evolution (via the mergeSchema option).
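A minimal sketch of what such a COPY INTO statement could look like, built as a string in Python. The table name `raw.orders`, the source path, and the helper `build_copy_into` are hypothetical and only for illustration; the `mergeSchema` options are what enable schema evolution.

```python
def build_copy_into(table, source_path, file_format="CSV", merge_schema=True):
    """Build a Databricks COPY INTO statement (hypothetical helper)."""
    flag = "true" if merge_schema else "false"
    return (
        f"COPY INTO {table}\n"
        f"FROM '{source_path}'\n"
        f"FILEFORMAT = {file_format}\n"
        # mergeSchema in FORMAT_OPTIONS infers one schema across the source files.
        f"FORMAT_OPTIONS ('header' = 'true', 'mergeSchema' = '{flag}')\n"
        # mergeSchema in COPY_OPTIONS lets the target table's schema evolve.
        f"COPY_OPTIONS ('mergeSchema' = '{flag}')"
    )


# Table name and path are assumptions, not taken from a real workspace.
statement = build_copy_into("raw.orders", "/Volumes/main/landing/orders/")
print(statement)

# In a Databricks notebook, where `spark` is predefined, this would be run as:
# spark.sql(statement)
```

Running the same statement a second time should load nothing new, because files that were already ingested are skipped.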
- For comparison, a full load into an existing table in Snowflake typically looks like this:
  - load the data into a staging table,
  - truncate the raw (existing) table,
  - then load the data from the staging table into the raw table.
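The three full-load steps above can be sketched roughly as follows. This uses Python's built-in `sqlite3` as a stand-in for Snowflake so that it runs anywhere; the table names `staging_orders` and `raw_orders` are hypothetical, and since SQLite has no TRUNCATE, `DELETE FROM` takes its place.

```python
import sqlite3


def full_load(conn, rows):
    """Full-load refresh: stage, truncate, reload (table names are hypothetical)."""
    cur = conn.cursor()
    # 1. Load the incoming data into the staging table (emptied first).
    cur.execute("DELETE FROM staging_orders")
    cur.executemany("INSERT INTO staging_orders (id, amount) VALUES (?, ?)", rows)
    # 2. Truncate the raw table (in Snowflake: TRUNCATE TABLE raw_orders).
    cur.execute("DELETE FROM raw_orders")
    # 3. Load the data from the staging table into the raw table.
    cur.execute("INSERT INTO raw_orders SELECT id, amount FROM staging_orders")
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staging_orders (id INTEGER, amount REAL)")
conn.execute("CREATE TABLE raw_orders (id INTEGER, amount REAL)")

batch = [(1, 9.99), (2, 24.50)]
full_load(conn, batch)
full_load(conn, batch)  # re-running replaces the rows instead of duplicating them

row_count = conn.execute("SELECT COUNT(*) FROM raw_orders").fetchone()[0]
print(row_count)
```

The truncate step is what keeps reruns from duplicating rows here, at the cost of reloading everything each time.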
Databricks is an end-to-end platform for data-related projects.
- It provides tools for data ingestion, transformation, ETL pipelines, dashboards, and AI models.
- Ultimately, it lets you build AI on top of your own data, giving the models context that is unique to your business.

Wil's Second Brain