Business Intelligence

Description

Data analysis with the Microsoft PowerBI tool. Features:

  • Individual work.
    • Evaluation: content and presentation of the work (defending the work is mandatory).
    • Submit on virtual.ipb.pt in the activity “Practical Work – PowerBI”.
      o Work submitted by any other means is not accepted.

Two files must be submitted (separately or zipped):

  1. PowerBI file (pbix)
  2. Report (pdf)

Evaluation methodology

  • Alternative – (Ordinary, Worker) (Final, Appeal, Special)
  • Practical Work – 50%
  • Final Written Exam – 50% (minimum exam score: 7 points)


Work

         Description

The dataset for this competition is a relational set of files describing customers’ orders over time. The goal of the competition is to predict which products will be in a user’s next order. The dataset is anonymized and contains a sample of over 3 million grocery orders from more than 200,000 Instacart users. For each user, we provide between 4 and 100 of their orders, with the sequence of products purchased in each order. We also provide the week and hour of day the order was placed, and a relative measure of time between orders. For more information, see the blog post accompanying its public release.

Dataset: https://cloud.ipb.pt/f/63227e65fa294666a8fe/

  • Description of files

Each entity (customer, product, order, aisle, etc.) has an associated unique id.  Most of the files and variable names should be self-explanatory.

  • aisles.csv

aisle_id,aisle
1,prepared soups salads
2,specialty cheeses
3,energy granola bars

  • departments.csv

department_id,department
1,frozen
2,other
3,bakery

  • order_products__*.csv

These files specify which products were purchased in each order. order_products__prior.csv contains previous order contents for all customers. ‘reordered’ indicates that the customer has a previous order that contains the product. Note that some orders will have no reordered items. You may predict an explicit ‘None’ value for orders with no reordered items.

order_id,product_id,add_to_cart_order,reordered
1,49302,1,1
1,11109,2,1
1,10246,3,0
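As a rough illustration of how the ‘reordered’ flag can be aggregated, the sketch below computes the share of reordered products per order from the three sample rows shown above. It uses only the Python standard library; the function name `reorder_share_by_order` is a hypothetical helper introduced for this example, and the same calculation would normally be built as a measure in PowerBI.

```python
import csv
import io
from collections import defaultdict

# The three sample rows of order_products shown above.
SAMPLE = """order_id,product_id,add_to_cart_order,reordered
1,49302,1,1
1,11109,2,1
1,10246,3,0
"""


def reorder_share_by_order(text):
    """Return {order_id: fraction of the order's products flagged as reordered}."""
    totals = defaultdict(int)
    reordered = defaultdict(int)
    for row in csv.DictReader(io.StringIO(text)):
        order_id = int(row["order_id"])
        totals[order_id] += 1
        reordered[order_id] += int(row["reordered"])
    return {order_id: reordered[order_id] / totals[order_id] for order_id in totals}


shares = reorder_share_by_order(SAMPLE)
for order_id, share in shares.items():
    print(f"order {order_id}: {share:.1%} of products were reordered")
```

In the sample, two of the three products in order 1 carry the flag, so the share for that order is two thirds.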

  • orders.csv

This file tells to which set (prior, train, test) an order belongs. You are predicting reordered items only for the test set orders. ‘order_dow’ is the day of the week (0 is Saturday and 1 is Sunday).

order_id,user_id,eval_set,order_number,order_dow,order_hour_of_day,days_since_prior_order
2539329,1,prior,1,2,08,
2398795,1,prior,2,3,07,15.0
473747,1,prior,3,3,12,21.0
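The sketch below shows one way the orders.csv columns could be decoded before reporting. It assumes the day codes continue in calendar order from the two values given above (0 = Saturday, 1 = Sunday), which should be checked against the data, and it treats the empty days_since_prior_order of a first order as missing. The names `DAY_NAMES`, `day_name` and `parse_orders` are hypothetical and introduced only for this example.

```python
import csv
import io

# Assumption: codes 0 and 1 are Saturday and Sunday, as stated above, and the
# remaining codes follow in calendar order. Verify this before relying on it.
DAY_NAMES = ["Saturday", "Sunday", "Monday", "Tuesday",
             "Wednesday", "Thursday", "Friday"]

# The three sample rows of orders.csv shown above.
SAMPLE = """order_id,user_id,eval_set,order_number,order_dow,order_hour_of_day,days_since_prior_order
2539329,1,prior,1,2,08,
2398795,1,prior,2,3,07,15.0
473747,1,prior,3,3,12,21.0
"""


def day_name(order_dow):
    """Translate an order_dow code (as text) into a day name."""
    return DAY_NAMES[int(order_dow)]


def parse_orders(text):
    """Parse orders rows, decoding the day and handling the empty first-order gap."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        gap = row["days_since_prior_order"]
        rows.append({
            "order_id": int(row["order_id"]),
            "day": day_name(row["order_dow"]),
            "hour": int(row["order_hour_of_day"]),
            # A user's first order has no previous order, so the field is empty.
            "days_since_prior_order": float(gap) if gap else None,
        })
    return rows


orders = parse_orders(SAMPLE)
```

Keeping the first-order gap as a missing value rather than 0 avoids skewing averages of days between purchases.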

  • products.csv

product_id,product_name,aisle_id,department_id
1,Chocolate Sandwich Cookies,61,19
2,All-Seasons Salt,104,13
3,Robust Golden Unsweetened Oolong Tea,94,7

  • users.csv

user_id;bd;sex;country
1; Saturday, September 14, 1991; M;Poland
2; Thursday, November 30, 1989; M;Central African Republic
3; Friday, October 20, 1967; M;France…
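Unlike the other files, users.csv is semicolon-delimited and stores the birth date (bd) as a long English date with surrounding spaces. The following sketch shows one possible way to parse it and derive an age group. The age bins, the reference date of 2020-01-01 and the helper names are all assumptions made for illustration, not part of the assignment, and the date parsing assumes an English locale.

```python
import csv
import io
from datetime import date, datetime

# The three sample rows of users.csv shown above.
SAMPLE = """user_id;bd;sex;country
1; Saturday, September 14, 1991; M;Poland
2; Thursday, November 30, 1989; M;Central African Republic
3; Friday, October 20, 1967; M;France
"""


def parse_birth_date(text):
    """Parse a date such as ' Saturday, September 14, 1991' (English locale assumed)."""
    return datetime.strptime(text.strip(), "%A, %B %d, %Y").date()


def age_on(birth, reference):
    """Age in completed years on the reference date."""
    had_birthday = (reference.month, reference.day) >= (birth.month, birth.day)
    return reference.year - birth.year - (0 if had_birthday else 1)


def age_group(age):
    """Hypothetical age bins; choose bins that suit the report."""
    if age < 18:
        return "under 18"
    if age < 30:
        return "18-29"
    if age < 45:
        return "30-44"
    if age < 65:
        return "45-64"
    return "65+"


def load_users(text, reference):
    users = []
    for row in csv.DictReader(io.StringIO(text), delimiter=";"):
        age = age_on(parse_birth_date(row["bd"]), reference)
        users.append({
            "user_id": int(row["user_id"]),
            "sex": row["sex"].strip(),
            "country": row["country"].strip(),
            "age": age,
            "age_group": age_group(age),
        })
    return users


# The reference date is an arbitrary placeholder chosen for this example.
users = load_users(SAMPLE, reference=date(2020, 1, 1))
```

Because the orders carry no calendar dates, any age calculation needs an explicitly chosen reference date; stating it in the report keeps the age groups reproducible.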

         Structure

Understanding of the dataset structure:

  • Users are identified by user_id in the orders.csv file. Each row of orders.csv represents an order made by a user. Orders are identified by order_id.
  • Each order of a user has an order_number, which gives its position in the sequence of that user’s orders.
  • Each order consists of a set of products, each with an add_to_cart_order value giving the position at which it was added to the cart in that order.
  • For each user we may have n-1 prior orders and 1 train order, or n-1 prior orders and 1 test order for which we have to state which products have been reordered.
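The structure described in the points above can be sketched as two groupings: order lines grouped by order_id and sorted by add_to_cart_order, and orders grouped by user_id and sorted by order_number. The rows below are made up for illustration and are not taken from the dataset; the function names are hypothetical as well.

```python
from collections import defaultdict

# Illustrative rows only (made up for this sketch, not taken from the dataset).
# Columns: order_id, user_id, eval_set, order_number
orders = [
    (101, 1, "prior", 1),
    (102, 1, "prior", 2),
    (103, 1, "train", 3),
    (201, 2, "prior", 1),
    (202, 2, "test", 2),
]
# Columns: order_id, product_id, add_to_cart_order, reordered
order_products = [
    (101, 10, 2, 0),
    (101, 11, 1, 0),
    (102, 11, 1, 1),
    (103, 10, 1, 1),
    (103, 12, 2, 0),
    (201, 12, 1, 0),
]


def baskets_by_order(lines):
    """Map each order_id to its product_ids in the order they were added to the cart."""
    grouped = defaultdict(list)
    for order_id, product_id, position, _reordered in lines:
        grouped[order_id].append((position, product_id))
    return {oid: [pid for _, pid in sorted(items)] for oid, items in grouped.items()}


def history_by_user(order_rows, baskets):
    """Map each user_id to its orders sorted by order_number, with each basket attached."""
    grouped = defaultdict(list)
    for order_id, user_id, eval_set, order_number in order_rows:
        grouped[user_id].append((order_number, eval_set, baskets.get(order_id, [])))
    return {uid: sorted(items) for uid, items in grouped.items()}


baskets = baskets_by_order(order_products)
history = history_by_user(orders, baskets)
```

In this toy data, user 1 has two prior orders followed by one train order, while user 2 has one prior order followed by a test order whose contents are not given, which mirrors the last point above.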

         Purpose of the work

Those responsible for the site want the information system to produce reports that support better decision making. To that end, they want answers to the following questions:

  1. Characterization of customers by age group, gender and nationality.
  2. On which day of the week (and at what time) do women do the most shopping?
  3. What are the most important departments and aisles (by number of products)?
  4. What are the most important aisles in each department (by number of products)?
  5. What are customers’ favorite departments and aisles (by age, gender and country)?
  6. When do customers do the most shopping: by day of the week, time of day, department and aisle?
  7. On which day of the week do we get the most new customers (by age, gender and country)?
  8. How many products (on average) do customers purchase in their first order?
  9. How often do customers shop (total days between purchases)? Do underage customers shop more often?
  10. How many products do customers buy in each order (by day of the week and time of day)?
  11. How often do customers buy the same items again (reorders), i.e. what percentage of the products in an order were previously ordered?
  12. Which products sell best in the morning (0-12) and in the afternoon (12-24)? Which type of customer does the most shopping in each of the two periods?
  13. Top 10 best-selling products for a given day of the week, department and aisle.
  14. Which products (Top 12) are most likely to be purchased again?
  15. Which products (Top 10) do customers put in the cart first (by age, gender and country)?
  16. Is there any association between the date of the last order and the probability of a new order?
  17. Is there any association between the number of orders and the probability of a reorder?
  18. What is the percentage of organic vs. non-organic orders (by age, gender and country)?
  19. Which customers always buy the same products (reorders)?
  20. For a given product, identify the customers most likely to buy it.
  21. Identify behavior patterns (data mining).
