  • Digital Guest Care Ticket Data Ingestion

    Jira Initiatives

    TBC

    Project Status

    Draft

    Created On

    Jul 8, 2024

    Key Business Stakeholders

    @Siobhan Van Der Kley @Jonathan Brown @Victoria Quan @Rayand Ramlal

    Engineering

    @Anirudha Porwal

    Due Date

    TBD

    Objective

    Digital Business and Guest Care are looking to access the guest care ticket data coming through the Digital platforms. This will allow end-users to query the data for routine and ad hoc descriptive reporting, and to build dashboards that give key stakeholders and leadership self-service insights on guest care performance. An example use case is identifying large swings in tickets linked to specific restaurants, allowing Digital to direct Field Team resources to those restaurants to investigate potential operational challenges. Lastly, it is envisioned that ML modelling (e.g. NLP) will be applied to this data to extract sentiment and surface leading indicators. To that end, this set of requirements focuses on ingesting this data into Digital's Databricks environment.
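
    For illustration, a minimal PySpark sketch of the restaurant-swing use case as it might be queried once the data lands; the table path and the 50% threshold are assumptions, and column names follow the Requirements below:

```python
from pyspark.sql import functions as F
from pyspark.sql.window import Window

# Assumed landing table; see the Location section for the actual target schema.
tickets = spark.table("loyalty.guest_care.tickets_with_feedback")

# Daily ticket counts per restaurant
daily = (
    tickets.groupBy("Store_ID", F.to_date("Create_Time").alias("day"))
    .agg(F.count("Ticket_ID").alias("n_tickets"))
)

# Flag large day-over-day swings per restaurant
w = Window.partitionBy("Store_ID").orderBy("day")
swings = (
    daily.withColumn("prev_day", F.lag("n_tickets").over(w))
    .withColumn("pct_change", (F.col("n_tickets") - F.col("prev_day")) / F.col("prev_day"))
    .where(F.abs(F.col("pct_change")) > 0.5)  # illustrative threshold
)
```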

    Requirements

    The following are the requirements from Business and Digital Guest Care:

    Data

    • All unmasked data fields from Snowflake table BRAND_TH.ZENDESK.TICKETS_WITH_FEEDBACK (sample: )

    • For CAN & US

    • For all periods available (at minimum from January 1, 2020, inclusive)

    • Data types should be preprocessed where applicable prior to ingestion, e.g. parsing timestamp strings into proper timestamp types (see the ingestion sketch after this list)

    • Critical columns include: Store_ID, Ticket_ID, Create_Time, Update_Time, Order_Time, Country_Code, Status, User_ID, Service_Mode, Case_Level_Tags, Severity, Ticket_Tags, New_Customer_Inquiry_Field, Form_Category, Agent_Comments, Duration_since_Last_update, Time_Spent_Total, Subject, Comment, Ticket_URL, Complaint_Type, Is_deleted [Non-exhaustive - TBC based on availability of metadata]

    • Clear indication of the critical timestamps (e.g. ticket create time, modification times)
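
    The read side could look like the following PySpark sketch, assuming the Spark Snowflake connector available on Databricks; connection values, the secret scope, and the exact Country_Code values are placeholders to confirm:

```python
from pyspark.sql import functions as F

# Placeholder connection details; real values would come from a Databricks
# secret scope (the "guest-care" scope name here is hypothetical).
sf_options = {
    "sfUrl": "<account>.snowflakecomputing.com",
    "sfUser": dbutils.secrets.get("guest-care", "sf-user"),
    "sfPassword": dbutils.secrets.get("guest-care", "sf-password"),
    "sfDatabase": "BRAND_TH",
    "sfSchema": "ZENDESK",
    "sfWarehouse": "<warehouse>",
}

tickets = (
    spark.read.format("snowflake")
    .options(**sf_options)
    .option("dbtable", "TICKETS_WITH_FEEDBACK")
    .load()
    # CAN & US only; actual Country_Code values to be confirmed
    .where(F.col("Country_Code").isin("CA", "US"))
    # Minimum history: January 1, 2020 onward
    .where(F.col("Create_Time") >= "2020-01-01")
    # Example type preprocessing: ensure critical timestamps are true timestamps
    .withColumn("Create_Time", F.to_timestamp("Create_Time"))
    .withColumn("Update_Time", F.to_timestamp("Update_Time"))
)
```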

    Location

    • Dataset should be replicated into an appropriate schema within the Loyalty catalog in Databricks
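
    Continuing the read sketch above, a minimal write into the Loyalty catalog; the guest_care schema name is an assumption, to be agreed with the Databricks platform owners:

```python
# Land the preprocessed frame as a managed Delta table in the Loyalty catalog.
# "guest_care" is a placeholder schema name.
tickets.write.mode("overwrite").saveAsTable("loyalty.guest_care.tickets_with_feedback")
```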

    Frequency

    • Data should be refreshed at least daily, with the prior full day's data available by 07:00 EST the following day
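
    Assuming the pipeline runs as a scheduled Databricks job, the schedule fragment of the job definition (Jobs API 2.1) might look like the following; the 06:30 run time and the timezone choice are assumptions that leave headroom before the 07:00 EST target:

```python
# Schedule fragment of a Databricks Jobs 2.1 job definition.
job_schedule = {
    "schedule": {
        # Quartz cron: sec min hour day-of-month month day-of-week
        "quartz_cron_expression": "0 30 6 * * ?",  # daily at 06:30
        "timezone_id": "America/Toronto",  # observes EST/EDT; confirm vs. fixed EST
        "pause_status": "UNPAUSED",
    }
}
```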

    Metadata

    • Metadata should be provided along with the table, indicating the definitions (and limitations, where applicable) of the critical columns to begin with
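
    One way to surface these definitions directly in the Databricks catalog is via column comments; a hedged sketch, with the table path and the wording of the definitions as placeholders pending input from Guest Care:

```python
# Attach column definitions as Delta column comments so they appear in the
# catalog UI. Definitions below are placeholders, not confirmed metadata.
column_definitions = {
    "Ticket_ID": "Zendesk ticket identifier (definition TBC with Guest Care)",
    "Create_Time": "Timestamp the ticket was created (timezone TBC)",
    "User_ID": "Submitting user; mapping to loyalty ID TBC",
}

for col, definition in column_definitions.items():
    spark.sql(
        f"ALTER TABLE loyalty.guest_care.tickets_with_feedback "
        f"ALTER COLUMN {col} COMMENT '{definition}'"
    )
```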

    Risks

    • Free-text fields (e.g. Subject, Comment, Agent_Comments) may contain PII; this should be risk-assessed prior to ingestion (see the sketch after this list)

    • Potential data lags: an assessment should be made, prior to ingestion, to determine whether any data points arrive at non-standard frequencies

    • Project timeline overruns

    • Key columns not populating consistently, e.g. loyalty IDs (whether User_ID carries the loyalty ID is TBC), Store_ID, etc. (see the sketch after this list)
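
    To help size the PII and column-population risks above, a minimal data-quality sketch, assuming the table path from the Location section; the email regex is a crude illustrative probe, not a complete PII assessment:

```python
from pyspark.sql import functions as F

df = spark.table("loyalty.guest_care.tickets_with_feedback")  # assumed path

# Null rates for key columns, to quantify how consistently they populate
key_cols = ["User_ID", "Store_ID", "Ticket_ID", "Create_Time"]
null_rates = df.select(
    *[F.avg(F.col(c).isNull().cast("double")).alias(f"{c}_null_rate") for c in key_cols]
)
null_rates.show()

# Share of tickets whose free-text Comment contains an email-like string
email_like = df.select(
    F.avg(F.col("Comment").rlike(r"[\w.+-]+@[\w-]+\.[\w.]+").cast("double")).alias("email_rate")
)
email_like.show()
```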