Project Details
Description
The Athena project will develop technology called 'search as learning,' a set of search technologies that encourage and support learning rather than just simple document finding. In order to learn, searchers must engage with information that is both novel and understandable. Therefore, at the core, Athena will support learning by modeling several important factors: (1) the knowledge connections between documents covering a topic, (2) a user's current state of knowledge on that topic, (3) the types of knowledge a user is likely to gain from a document, and (4) the knowledge required for a user to successfully engage with a document. The Athena project will involve two types of end-to-end systems, both of which will model and leverage the learner's state of knowledge (LSK): an LSK-aware search engine and an LSK-aware question answering system. The Athena systems will guide a user through a topic and find relevant information in the context of previously encountered information and the topic structure captured in a web of topics. The team will evaluate Athena using standard measures as well as a series of studies involving human subjects. If the Athena project is successful, it will make it easier for people to use search engines and related technologies to learn about complex topics, where there are numerous interrelated and dependent subtopics that should be considered. Given that search is among the most common online activities on and off the Web, Athena and its technologies will have a substantial impact on searchers trying to learn such topics.
Athena enables 'search as learning' using a data structure referred to as a Learning Flow Graph (LFG). An LFG comprises nodes that represent sub-topics (e.g., concepts) within a given domain and edges that represent relations between sub-topics (e.g., one sub-topic being foundational to understanding another). Athena leverages LFGs to model the different factors mentioned above. It uses probability distributions across nodes in an LFG to model: (1) a user's knowledge state, (2) the potential knowledge gains from an information item, and (3) the prerequisite knowledge required for a user to successfully engage with an information item. The Athena team will develop algorithms for generating LFGs from structured, semi-structured, and unstructured resources (e.g., course syllabi, tables of contents, book indices, knowledge bases, query logs), algorithms for integrating LFGs into search and question-answering models, and algorithms for re-estimating LFGs and a user's knowledge state based on search behaviors (e.g., queries, clicks, skips, dwell times). Structuring textual data to find the optimal learning paths through it is of great interest, though most existing work has focused on extracting information to fill slots in a 'knowledge base,' a much finer-grained task. The LFG representation also provides a type of explanation of a larger topic, connecting to the broad interest in explainable systems. The Athena work will extend the state of the art in text representation, neural approaches including attention techniques, query and topic modeling, contextual text summarization, and understanding human approaches to complex search activities.
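To make the LFG idea concrete, the following is a minimal illustrative sketch in Python, not the Athena team's implementation. It assumes a toy graph whose directed edges run from a prerequisite sub-topic to a dependent one, represents a document's knowledge gains and prerequisites as probability distributions over nodes, and, as a simplifying assumption, represents the user's knowledge state as a per-node mastery probability. The names `LearningFlowGraph`, `readiness`, `novelty`, and `score`, the example sub-topics, and the product-based scoring rule are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class LearningFlowGraph:
    """Toy LFG: nodes are sub-topics; an edge (a, b) means a is foundational to b."""
    nodes: list[str]
    edges: set[tuple[str, str]] = field(default_factory=set)

    def prerequisites(self, node: str) -> list[str]:
        """Return the direct prerequisites of a sub-topic, sorted by name."""
        return sorted(src for src, dst in self.edges if dst == node)


def normalize(weights: dict[str, float]) -> dict[str, float]:
    """Turn non-negative node weights into a probability distribution."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return {node: w / total for node, w in weights.items()}


def readiness(known: dict[str, float], required: dict[str, float]) -> float:
    """Expected mastery of the sub-topics a document requires."""
    return sum(p * known.get(node, 0.0) for node, p in required.items())


def novelty(known: dict[str, float], gains: dict[str, float]) -> float:
    """Expected share of a document's content the user has not yet mastered."""
    return sum(p * (1.0 - known.get(node, 0.0)) for node, p in gains.items())


def score(known: dict[str, float], gains: dict[str, float],
          required: dict[str, float]) -> float:
    """Hypothetical ranking signal: high only if novel AND understandable."""
    return readiness(known, required) * novelty(known, gains)


# Hypothetical three-node topic: probability -> bayes_rule -> naive_bayes.
lfg = LearningFlowGraph(
    nodes=["probability", "bayes_rule", "naive_bayes"],
    edges={("probability", "bayes_rule"), ("bayes_rule", "naive_bayes")},
)

# Assumed user state: per-node mastery probabilities (not a single distribution).
known = {"probability": 0.9, "bayes_rule": 0.5, "naive_bayes": 0.0}

# Document A teaches naive_bayes and requires bayes_rule.
score_a = score(known, normalize({"naive_bayes": 1.0}), normalize({"bayes_rule": 1.0}))
# Document B teaches bayes_rule and requires probability.
score_b = score(known, normalize({"bayes_rule": 1.0}), normalize({"probability": 1.0}))

print(lfg.prerequisites("naive_bayes"), score_a, score_b)
```

Multiplying readiness by novelty is only one way to encode the requirement that useful information be both novel and understandable; a real system would presumably learn such a function and re-estimate the user's state from search behavior.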
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Status | Active |
---|---|
Effective start/end date | 1/10/21 → 30/9/24 |
Links | https://www.nsf.gov/awardsearch/showAward?AWD_ID=2106334 |
Funding
- National Science Foundation: US$240,530.00
ASJC Scopus Subject Areas
- Information Systems
- Computer Science(all)