Q01🔍Eksplorasi & Profiling Data☆☆☆☆Copy-pasteSafe to Run

Row Count & Basic Stats per Tabel

Dialect:bigquerypostgresqlsnowflakemysql
#row-count#profiling#basic-stats#exploration

Kapan Pakai Query Ini

Query pertama yang kamu jalanin setiap kali sentuh tabel baru. Kasih gambaran awal: seberapa besar tabelnya, date range-nya sampai mana, dan ada nggak ID yang duplikat. Lima detik untuk tahu apakah data ini "masuk akal" sebelum kamu lanjut deeper.

Yang Perlu Diganti

  • your_schema.your_table → nama tabel kamu
  • id → kolom primary key atau ID unik di tabel
  • created_at → kolom timestamp atau tanggal tabel dibuat / event terjadi

Query

SELECT
  COUNT(*)           AS total_rows,
  COUNT(DISTINCT id) AS unique_ids,
  COUNT(*) - COUNT(DISTINCT id) AS duplicate_id_count,
  MIN(created_at)    AS earliest_record,
  MAX(created_at)    AS latest_record,
  DATE_DIFF(DATE(MAX(created_at)), DATE(MIN(created_at)), DAY) AS date_span_days
FROM your_schema.your_table;

Contoh Output

total_rowsunique_idsduplicate_id_countearliest_recordlatest_recorddate_span_days
1,250,0001,248,9321,0682023-01-01 00:00:002024-12-31 23:59:59730

Variasi

  • Hapus kolom date_span_days kalau tabel kamu tidak punya kolom tanggal
  • Ganti COUNT(DISTINCT id) dengan kolom ID yang relevan. Kalau belum tahu mana ID-nya, coba order_id, user_id, atau transaction_id
  • Tambah WHERE created_at >= CURRENT_DATE - 30 untuk scope hanya 30 hari terakhir
  • Kalau tabel besar dan query lambat, tambah TABLESAMPLE SYSTEM (10 PERCENT) setelah nama tabel untuk estimasi cepat

Variasi Dialect

PostgreSQL

SELECT
  COUNT(*)           AS total_rows,
  COUNT(DISTINCT id) AS unique_ids,
  COUNT(*) - COUNT(DISTINCT id) AS duplicate_id_count,
  MIN(created_at)    AS earliest_record,
  MAX(created_at)    AS latest_record,
  (MAX(created_at)::DATE - MIN(created_at)::DATE) AS date_span_days
FROM your_schema.your_table;

Snowflake

SELECT
  COUNT(*)           AS total_rows,
  COUNT(DISTINCT id) AS unique_ids,
  COUNT(*) - COUNT(DISTINCT id) AS duplicate_id_count,
  MIN(created_at)    AS earliest_record,
  MAX(created_at)    AS latest_record,
  DATEDIFF('day', MIN(created_at), MAX(created_at)) AS date_span_days
FROM your_schema.your_table;

MySQL

SELECT
  COUNT(*)           AS total_rows,
  COUNT(DISTINCT id) AS unique_ids,
  COUNT(*) - COUNT(DISTINCT id) AS duplicate_id_count,
  MIN(created_at)    AS earliest_record,
  MAX(created_at)    AS latest_record,
  DATEDIFF(MAX(created_at), MIN(created_at)) AS date_span_days
FROM your_schema.your_table;
Row Count & Basic Stats per Tabel — Analyst SQL Vault | NgulikSQL