3 days running average / capco

 data = [Row(Date='2023-01-01', ProductID=100, QuantitySold=10),

        Row(Date='2023-01-02', ProductID=100, QuantitySold=15),

        Row(Date='2023-01-03', ProductID=100, QuantitySold=20),

        Row(Date='2023-01-04', ProductID=100, QuantitySold=25),

        Row(Date='2023-01-05', ProductID=100, QuantitySold=30),

        Row(Date='2023-01-06', ProductID=100, QuantitySold=35),

        Row(Date='2023-01-07', ProductID=100, QuantitySold=40),

        Row(Date='2023-01-08', ProductID=100, QuantitySold=45)]

###########################################################################

I have the following sales data for a product, with the quantity sold on each date:


Date ProductID QuantitySold

2023-01-01 100 10

2023-01-02 100 15

2023-01-03 100 20

2023-01-04 100 25

2023-01-05 100 30

2023-01-06 100 35

2023-01-07 100 40

2023-01-08 100 45

I need to calculate the 3-day running average of the QuantitySold for this product.

use spark/scala/pyspark for calculation

Expected Output

Date ProductID QuantitySold RunningAvg3Days

2023-01-01 100       10 10.00

2023-01-02 100       15 12.50

2023-01-03 100       20 15.00

2023-01-04 100       25 20.00

2023-01-05 100       30 25.00

2023-01-06 100       35 30.00

2023-01-07 100       40 35.00

2023-01-08 100       45 40.00

Comments

Popular posts from this blog

sql question set 1

SCD Type 2 in MySQL/ coforge