This code is designed to build and analyze a mathematical model, specifically a Markov chain, to simulate customer transitions through various states. The model is then used to test the outputs of an AI, in this case, a large language model (LLM), against the expected behavior in the real-world model. The chi-squared test is employed to understand how the probability distributions vary between the AI's outputs and the expected real-world transitions.
- Numpy: For numerical operations and handling arrays.
- Transformers: For loading a pre-trained tokenizer and model.
- Sentence Transformers: For loading a cross-encoder model.
- Scipy: For statistical analysis, specifically the chi-squared test.
- States: Defined as:
- "New Customer"
- "Engaged Customer"
- "Interested in Upgrade"
- "Requested Upgrade Info"
- "Upgraded"
- Transition Matrix: A matrix defining the probabilities of transitioning from one state to another.
- Tokenizer and Model: Loaded from the BERT base uncased model.
- Cross-Encoder: Loaded from the
ms-marco-MiniLM-L-6-v2
model.
- get_state_index(state): Returns the index of a given state.
- determine_state(sentence, states): Uses the cross-encoder to determine the most likely state for a given sentence.
- get_transition_probabilities(current_state, transition_matrix): Retrieves the transition probabilities for a given state.
- Dialogs: A dictionary containing multiple dialogs, each represented as a list of sentences.
- Processing Logic:
- Each sentence in a dialog is processed to determine its state.
- Transitions between states are recorded.
- State counts and transition counts are updated.
- Observed transition probabilities are calculated and compared against the expected probabilities using a chi-squared test.
- Chi-Squared Test:
- The chi-squared test compares the observed transition probabilities (from the AI's output) against the expected transition probabilities (from the Markov model).
- It calculates the chi-squared statistic and p-value to determine if there is a significant difference between the observed and expected probabilities.
- A p-value less than the significance level (alpha = 0.05) indicates that the observed probabilities significantly differ from the expected probabilities.
- Building a Markov Chain: The code constructs a Markov chain model to simulate customer transitions through different states.
- Testing AI Outputs: The AI's outputs are tested against the expected behavior in the real-world model.
- Understanding Probability Distributions: The chi-squared test is used to understand how the probability distributions vary between the AI's outputs and the expected transitions in the Markov model.
This approach helps in assessing the performance and reliability of the AI in mimicking real-world customer behavior transitions.