Opponent Aware Neural Quiz Player
- Improve the NLP model using a deep RL agent.
Three steps:
- Train a GRU content model with a full time supervision (FTS) loss.
- Model the opponent with deep RL (algorithm implemented: Double Deep Q-Learning) to train the buzzing model.
- Retrain the GRU with a weighted FTS loss, where the weights are dictated by the RL agent trained in the previous step.
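
For reference, here is a minimal sketch of the target computation that distinguishes Double Deep Q-Learning from plain DQN. It is not taken from this project's code. The function name `double_dqn_targets`, its parameters, and the two-action layout (index 0 = wait, index 1 = buzz) are assumptions made for illustration. The online network picks the best next action and the target network evaluates it.

```python
from typing import List, Sequence


def argmax(values: Sequence[float]) -> int:
    """Return the index of the largest value (first index on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def double_dqn_targets(
    rewards: Sequence[float],
    dones: Sequence[bool],
    next_q_online: Sequence[Sequence[float]],
    next_q_target: Sequence[Sequence[float]],
    gamma: float = 0.99,
) -> List[float]:
    """Compute Double DQN regression targets for a batch of transitions.

    next_q_online[i] and next_q_target[i] hold the Q-values of every action
    in the next state of transition i, as predicted by the online and target
    networks respectively (hypothetical inputs, produced elsewhere).
    """
    targets = []
    for r, done, q_on, q_tg in zip(rewards, dones, next_q_online, next_q_target):
        if done:
            # Terminal transition: no bootstrapping from the next state.
            targets.append(r)
            continue
        # Action selection uses the online network ...
        best_action = argmax(q_on)
        # ... while action evaluation uses the target network.
        targets.append(r + gamma * q_tg[best_action])
    return targets


# Illustrative batch with assumed actions: index 0 = wait, index 1 = buzz.
example = double_dqn_targets(
    rewards=[0.0, 10.0],
    dones=[False, True],
    next_q_online=[[1.0, 2.0], [0.0, 0.0]],
    next_q_target=[[0.5, 0.25], [0.0, 0.0]],
    gamma=0.5,
)
print(example)
```

In the first transition the online network prefers action 1, so the target uses the target network's value for that action (0.25) rather than the target network's own maximum (0.5). Decoupling selection from evaluation in this way is what reduces the overestimation bias of plain DQN.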