There is a growing desire to create computer systems that can collaborate with humans on complex, open-ended activities. These activities typically have no set completion criteria and frequently involve multimodal communication, extensive world knowledge, creativity, and building structures or compositions through multiple steps. Because these systems differ from question and answer (Q&A) systems, chatbots, and simple task-oriented assistants, new methods for evaluating such collaborative computer systems are needed. Here, we present a set of criteria for evaluating these systems, called the Hallmarks. The Hallmarks build on the success of heuristic evaluation used by the user interface community and past evaluation techniques used in the spoken language and chatbot communities. They consist of observable characteristics indicative of successful collaborative communication, grouped into eight high-level properties: robustness; habitability; mutual contribution of meaningful content; context-awareness; consistent human engagement; provision of rationale; use of elementary concepts to teach and learn new concepts; and successful collaboration. We present examples of how we used these Hallmarks in the DARPA Communicating with Computers (CwC) program to evaluate diverse activities, including story and music generation, interactive building with blocks, and exploration of molecular mechanisms in cancer. We used the Hallmarks as guides for developers and as diagnostics, assessing systems with the Hallmarks to identify strengths and opportunities for improvement using logs from user studies, surveying the human partner, third-party review of creative products, and direct tests. Informal feedback from CwC technology developers indicates that the use of the Hallmarks for program evaluation helped guide development. The Hallmarks also made it possible to identify areas of progress and major gaps in developing systems where the machine is an equal, creative partner.

Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8561722 (PMC)
http://dx.doi.org/10.3389/frai.2021.670009 (DOI)


