Skip to content

Latest commit

 

History

History
257 lines (222 loc) · 8.03 KB

File metadata and controls

257 lines (222 loc) · 8.03 KB

📊 SQL Insights for E-commerce

This repository collects SQL queries used for real marketing and sales analyses in an e-commerce project, aiming to deliver practical and strategic business insights. Each analysis includes the business question, the SQL query used, and a brief interpretation of the results.


🧱 Table Structure

dim_cliente : customer data (CPF, name, email, birth date, location).
dim_produto : product catalog (ID, SKU, name, category).
fato_vendas : transactional sales data (order, product, value, status, shipping, customer).


📋 Business Questions and SQL Analyses

1. What are the top 10 neighborhoods with the highest number of registered customers?

SELECT 
  bairro,
  COUNT(*) AS total_cliente,
  ROUND(COUNT(*) * 100 / SUM(COUNT(*)) OVER (), 2)||'%' AS percentual_clientes
FROM 
  dim_cliente
GROUP BY 
  bairro
ORDER BY 
  total_cliente DESC
LIMIT 10;

Insight: Helps identify priority regions for local campaigns, giveaways, or pickup partnerships.
Top 10 neighborhoods

2. What is the total sales value per month in the last year?

SELECT 
  DATE_PART('month', data_criacao) AS mes,
  SUM(valor_total - valor_envio) AS total_vendido
FROM (
  SELECT DISTINCT pedido_numero,
    cpf_cliente, endereco_entrega_nome,
    valor_total, valor_envio, data_criacao
  FROM fato_vendas
  WHERE pedido_situacao NOT IN ('Pedido Cancelado', 'Pagamento devolvido')
) pedidos_unicos
WHERE DATE_PART('year', data_criacao) = 2025
GROUP BY DATE_PART('month', data_criacao)
ORDER BY mes;

Insight: Reveals sales seasonality and supports planning promotions for slower months. total sales

3. Which are the top 5 best-selling products by quantity?

SELECT 
  produto_nome, 
  COUNT(*) AS qtd_vendido
FROM fato_vendas
GROUP BY produto_nome
ORDER BY qtd_vendido DESC
LIMIT 5;

Insight: Highlights the most popular items for featured promotions and stock prioritization. top 5 best-selling

4. How many orders were placed by each payment method?

SELECT 
  pagamento_nome, 
  COUNT(*) AS qtd
FROM (
  SELECT DISTINCT pedido_numero, cpf_cliente, pagamento_nome
  FROM fato_vendas
  WHERE pedido_situacao NOT IN ('Pedido Cancelado','Pagamento devolvido')
) pedidos_unicos
GROUP BY pagamento_nome
ORDER BY qtd DESC;

Insight: Shows preferred payment options and informs strategic offers (e.g., Pix, boleto, credit card). payment method

5. What is the average ticket per customer?

select 
  cpf_cliente,
  MIN(endereco_entrega_nome) AS nome_cliente,
  count(*) as qtd_pedidos,
  sum(valor_comprado_real) as total_cliente_comprou,
  round(sum(valor_comprado_real)/count(*),2) as ticket_medio_cliente
from(
  select distinct pedido_numero,
  cpf_cliente,
  endereco_entrega_nome,
  (valor_total-valor_envio) as valor_comprado_real
  from
    fato_vendas
  where
	pedido_situacao not in ('Pedido Cancelado','Pagamento devolvido') and endereco_entrega_nome!='CLIENTE WPP'
  order by
	pedido_numero
) pedidos_unicos
group by
cpf_cliente
order by
total_cliente_comprou desc

Insight: Enables segmentation for campaigns aimed at increasing average spend and loyalty programs. average ticket

6. Which product categories generate the highest revenue?

SELECT
  categoria,
  SUM(preco_produto) AS total_revenue
FROM (
  SELECT 
    fv.produto_id,
    fv.produto_nome,
    fv.preco_venda AS preco_produto,
    dp.categoria
  FROM fato_vendas fv 
  JOIN dim_produto dp ON dp.id_produto = fv.produto_id
  WHERE pedido_situacao NOT IN ('Pedido Cancelado','Pagamento devolvido')
) sales_by_category
GROUP BY categoria
ORDER BY total_revenue DESC
LIMIT 10;

Insight: Guides inventory prioritization and targeted category promotions. best categories

7. What percentage of customers are repeat buyers (2+ orders)?

WITH filtered_orders AS (
  SELECT DISTINCT pedido_numero, cpf_cliente
  FROM fato_vendas
  WHERE pedido_situacao NOT IN ('Pedido Cancelado','Pagamento devolvido')
    AND endereco_entrega_nome != 'CLIENTE WPP'
),
orders_per_customer AS (
  SELECT cpf_cliente, COUNT(*) AS order_count
  FROM filtered_orders
  GROUP BY cpf_cliente
),
repeat_customers AS (
  SELECT cpf_cliente
  FROM orders_per_customer
  WHERE order_count > 1
)
SELECT
  ROUND(100.0 * COUNT(*) / (SELECT COUNT(*) FROM orders_per_customer), 2) || '%' 
    AS repeat_customer_percentage
FROM repeat_customers;

Insight: Measures customer loyalty and informs retention strategies.

8. Which categories sell the most in each neighborhood?

SELECT
  fv.endereco_entrega_bairro,
  dp.categoria,
  COUNT(*) AS qtd_vendas
FROM fato_vendas fv 
JOIN dim_produto dp ON dp.id_produto = fv.produto_id
WHERE pedido_situacao NOT IN ('Pedido Cancelado','Pagamento devolvido')
GROUP BY fv.endereco_entrega_bairro, dp.categoria
HAVING COUNT(*) > 1
ORDER BY fv.endereco_entrega_bairro, qtd_vendas DESC;

Insight: Enables geo-targeted campaigns and optimized local inventory mixes. best categories

9. What is the average ticket per product category?

SELECT
  categoria,
  SUM(preco_produto) AS total_vendido,
  COUNT(*) AS total_sales,
  ROUND(SUM(preco_produto) / COUNT(*), 2) AS avg_ticket_category
FROM (
  SELECT 
    fv.produto_id,
    fv.produto_nome,
    fv.preco_venda AS preco_produto,
    dp.categoria
  FROM fato_vendas fv 
  JOIN dim_produto dp ON dp.id_produto = fv.produto_id
  WHERE pedido_situacao NOT IN ('Pedido Cancelado','Pagamento devolvido')
) sales_by_category
GROUP BY categoria
ORDER BY avg_ticket_category DESC
LIMIT 10;

Insight: Identifies high-value categories and opportunities for cross-sell promotions. ticket per category

10. How are customers distributed by age group?

SELECT
  COUNT(*) AS customer_count,
  ROUND(COUNT(*) * 100.0 / SUM(COUNT(*)) OVER (), 2) || '%' AS customer_percentage,
  CASE
    WHEN EXTRACT(YEAR FROM AGE(CURRENT_DATE, data_nascimento)) BETWEEN 0 AND 17 THEN '0-17 years'
    WHEN EXTRACT(YEAR FROM AGE(CURRENT_DATE, data_nascimento)) BETWEEN 18 AND 23 THEN '18-23 years'
    WHEN EXTRACT(YEAR FROM AGE(CURRENT_DATE, data_nascimento)) BETWEEN 24 AND 29 THEN '24-29 years'
    WHEN EXTRACT(YEAR FROM AGE(CURRENT_DATE, data_nascimento)) BETWEEN 30 AND 35 THEN '30-35 years'
    ELSE '35+ years'
  END AS age_group
FROM dim_cliente
GROUP BY age_group
ORDER BY customer_count DESC;

Insight: Helps tailor marketing messages to demographic segments. ticket per product

11. What is the total sold per year/month?

SELECT 
  TO_CHAR(data_criacao, 'YYYY/MM') AS year_month,
  SUM(valor_total - valor_envio) AS total_vendido
FROM (
  SELECT DISTINCT pedido_numero,
    cpf_cliente, endereco_entrega_nome,
    valor_total, valor_envio, data_criacao
  FROM fato_vendas
  WHERE pedido_situacao NOT IN ('Pedido Cancelado','Pagamento devolvido')
) pedidos_unicos
GROUP BY year_month
ORDER BY year_month;

Insight: Tracks sales evolution over time and identifies seasonal trends. total sold


📌 How to Contribute

Suggestions for new queries and analyses are welcome! Please open an issue or submit a pull request.


🚀 About

Built to consolidate key marketing and sales analyses using SQL in a real e-commerce context.