Artificial intelligence (AI) — glossary

By Marie Lebert, 16 April 2021.

While learning about artificial intelligence (AI), I wrote a short glossary for my colleagues. Wikipedia was very useful to enlighten me, as always. Please see also the glossary on computational linguistics.

AI / artificial intelligence
refers to the ability of machines to operate independently to perform tasks and activities that typically require the intelligence of humans; used in many fields such as (in alphabetical order) automotive, e-commerce, financial services, healthcare (biotech, medtech), government, manufacturing, retail, robotics, security and transportation

AI worker
human worker annotating, categorising, classifying, collecting, enriching, evaluating, labelling, rating, testing, translating, transcribing or validating data to train machine learning algorithms for AI applications; often recruited on a crowdsourcing marketplace to perform specific tasks or microtasks that cannot be automated

algorithm
automated instruction or set of rules used in mathematics and computer science

application programming interface / API
set of definitions, protocols and tools that lets software applications communicate with an operating system, a library or another application

audio annotation
transcription and time stamping of audio and speech data, including pronunciation, intonation, identification of language and dialect, and speaker demographics

augmented reality / AR
technology that superimposes computer-generated perceptual information (visual, auditory, sensory, olfactory) on a real-world environment

autonomous vehicle / AV
vehicle capable of sensing its environment and moving safely with little or no human input and with advanced control systems that interpret sensory information to identify navigation paths, obstacles and signage

autonomous vehicle data
data that uses computer vision (CV) to automatically recognise people crossing the road, sidewalks, road signs and other features; involves training data to teach algorithms after collecting image and video information, for example drawing a bounding box around a pedestrian, another vehicle, a traffic sign and a lane marking

big data
dataset that is too complex for standard data processing, for example big data obtained from user-generated content on social media sites and apps

biometrics
body measurements and calculations used to label and describe individuals, for example fingerprint, face recognition, DNA, palm print, hand geometry, and retina and iris patterns; used in biometric recognition, authentication and verification

chatbot
computer program that gives specific replies to specific questions, mostly embedded in a website or mobile app; simulates typical conversation threads for routine questions from users; retrieves information from training data (trained for example from the company’s FAQ, hard-coded answers, customer support chat scripts, emails and call logs) or from a larger content base using machine learning

computational linguistics
computer processing of natural language (written and spoken) for analysis and synthesis of language and speech; includes spell and grammar checkers, machine translation, speech synthesis, speech recognition, virtual assistants and smart speakers [glossary]

computational science
multidisciplinary field using computing capabilities for science

computer science
study of computers (hardware, software, networks) and computing concepts

computer vision / CV
extracts data from digital images and videos (2D and 3D images, point clouds, video sequences, views from multiple cameras) to process and analyse such data in order to automate tasks at a much larger scale and speed than the human visual system; used in facial recognition, autonomous vehicles, drones, medical imaging and surgery robotics

content analysis
process of studying digital data (text, images, audio, video) and communication patterns in a systematic manner

content moderation
human-powered and automated process of monitoring, assessing and filtering user-generated content (text, images, audio, video); includes deduplication (removing duplicate content), image quality (checking image subject, caption and metadata), harmful content detection (detecting violent, cruel and criminal activity), profanity detection (detecting coarse language and hate speech) and spam filtering

conversational AI
includes the following sequences: (a) speech-to-text (STT) conversion, i.e. converting the user audio file into text; (b) natural language understanding (NLU), i.e. analysing and processing text to create actionable instructions; (c) content relevance, i.e. returning relevant information to the user

conversational AI agent
can be a chatbot, a virtual assistant, a smart speaker or a smart home app; includes the following sequences: (a) data input, i.e. capturing commands or questions from users in an audio file that is converted into text; (b) natural language understanding (NLU), i.e. using entity extraction and intent recognition to interpret the text file; (c) dialogue management, i.e. cleaning the dialogue with dialogue state tracking; (d) natural language generation (NLG), i.e. converting the structured data into natural language; (e) data output, i.e. converting the natural language text data into audio output

conversational interface
interface that uses natural language processing (NLP) and natural language understanding (NLU) to run a conversation with a user, for example a voice assistant or a chatbot

crowdsourcing
process of harnessing the skills and knowledge from a large group of people to achieve a cumulative result (for example a marketplace for AI workers) or a collaborative service (for example an online encyclopedia such as Wikipedia)

crowdsourcing marketplace
crowdsourcing platform that allows businesses (requesters) to remotely hire workers (contractors) around the world to do specific tasks or microtasks that cannot be automated, for example training data for AI systems

data
data can be text, alphanumeric, images, audio, video, geo-local, speech, natural language, URLs, sensors and point clouds; training data to improve AI algorithms includes (in alphabetical order) data analysis, data annotation, data categorisation, data cleansing, data collection, data enrichment, data entry, data extraction, data labelling, data matching, data mining, data parsing, data tagging, data validation and data verification

data annotation
human-annotated and automated labelling of data (text, images, audio, video and more) to train machine learning models for AI applications; human-annotated tasks include for example word labelling for text and object labelling for images

data extraction
process of retrieving data from raw data for further data processing; in healthcare for example, data is retrieved from patient records and medical imaging (x-rays, CT scans, MRI scans, microscopic images) to detect, characterise and monitor diseases such as skin cancer or a brain tumour

data labelling
process of converting unlabelled data (text, images, audio, video, geo-local) into training data to improve AI algorithms; includes data annotation, data tagging and data classification

data mining
process of extracting data from large datasets for machine learning

data modelling
process of creating a data model that organises data and standardises how they relate to each other

data science
field that uses statistics, data analysis and machine learning to extract knowledge from data

decoding
process of converting code symbols back into useful information for users, for example information expressed in a natural language

deep learning
subset of machine learning based on artificial neural networks with multiple layers; can learn from data that is unstructured or unlabelled (unsupervised learning) as well as from labelled data, instead of relying on task-specific algorithms

entity
single key descriptor (person, country, city, organisation, topic, product and more)

entity linking
involves locating and disambiguating named entities (for example people and places) through the use of a knowledge database; used to add metadata to the training data in order to improve an AI algorithm

ethical AI
respect of human rights, diversity and privacy in AI systems; includes the behaviour of humans as they design, build and use AI systems, and the behaviour of machines using AI; issues include for example racial and gender algorithmic bias in facial and voice recognition technology, model bias in data training of AI systems, and privacy concerns for personal data and surveillance

face landmarking
detection and localisation of keypoints (landmarks) on the human face for biometric recognition; used for example in face detection for social media apps, face verification for smartphones, and facial emotion recognition for sentiment analysis

facial recognition
biometric technology matching a human face from a digital image or video against a database of human faces, i.e. pinpointing and measuring facial features to authenticate a user; used in ID verification services, video surveillance and automatic indexing of images

General Data Protection Regulation / GDPR
European Union (EU) regulation setting guidelines for collecting and processing personal information (privacy standards, data protection, network security) from individuals living in the European Economic Area (EEA); applies to any company or organisation attracting European visitors regardless of its location

geo-local data evaluation
process of verifying and enhancing raw location data for input into maps and navigation software; includes human-annotated tasks such as verifying driving, cycling and walking directions, point-of-interest (POI) tagging (for example buildings and geographical landmarks), adding local business information, and verifying and updating addresses

glossary
alphabetical list of terms relating to a specific subject, field, language or dialect, with a definition of each term

grammar
set of linguistic rules allowing for the combination of words into sentences; includes morphology (grammar of word forms) and syntax (grammar of sentence structure)

graphical user interface / GUI
interface which allows users to interact with electronic devices through graphical icons and visual indicators

hot word
word providing hands-free activation of a voice-command device

human-computer interaction / HCI
design and development of interfaces between users and computers, for example chatbots, voice assistants, search analysis and sentiment analysis

human in the loop / HITL
branch of machine learning that uses both human judgement and automation to create machine learning models: the algorithm is trained, tuned and tested by humans, who then feed the tasks back into the algorithm to make it better before retraining, testing and validating the model; used in natural language processing (NLP), search and information retrieval, computer vision (CV) and sentiment analysis

human intelligence task / HIT
term originally used on Amazon’s crowdsourcing marketplace to define a specific task done by a human worker, for example image tagging and labelling, data cleansing and verification, web content rating, audio transcription, and survey data collection

image annotation
process of adding identifier labels such as keywords, metadata, shapes (bounding boxes, cuboids, ellipses, polygons, keypoints, lines, splines, linear interpolation), semantic segmentation (linking each pixel to a class label) and landmark annotation (labelling keypoints) for object detection and localisation

image transcription
includes digitising the text within an image and image captioning in order to build training datasets for machine learning models such as optical character recognition (OCR) models

indexing
creation of an alphabetical list to locate data in a dataset

inference engine
system component that applies logical rules to the knowledge base in order to deduce new information
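
The deduction step described above can be sketched in a few lines of Python. This is only a minimal illustration of one common strategy (forward chaining), assuming that facts are plain strings and that each rule is a pair of premises and a conclusion; the function name forward_chain and the sample facts and rules are made up for the example.

```python
def forward_chain(facts, rules):
    """Apply rules repeatedly until no new fact can be deduced.

    facts: set of strings already known to be true.
    rules: list of (premises, conclusion) pairs, where premises is a set of strings.
    """
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            # A rule fires when all its premises are known and its conclusion is new.
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


# Hypothetical knowledge base and rules, for illustration only.
facts = {"has_feathers", "lays_eggs"}
rules = [
    ({"has_feathers"}, "is_bird"),
    ({"is_bird", "lays_eggs"}, "builds_nest"),
]
deduced = forward_chain(facts, rules)
print(sorted(deduced - facts))  # the facts that were newly deduced
```

The second rule can only fire after the first one has added "is_bird", which is why the loop keeps running until a full pass adds nothing new.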

information system
organised system for collecting, storing, classifying and communicating information

intent annotation
identifies user intent (request, command, booking, recommendation, confirmation) and user mood (happy, neutral, frustrated) in order for an AI application to respond accordingly

intent variation
refers to the different ways various users have to express the same intent in spoken or written form when they use a chatbot, a voice assistant or a search engine, i.e. how their question is phrased, for the AI model to extract the relevant information and understand the query

Internet of Things / IoT
network of physical devices, home appliances, vehicles, wearable devices and other items embedded with electronics, software, sensors, actuators (movers) and connectivity in order to collect and exchange data

knowledge base / KB
database that stores complex structured and unstructured information used by a computer system; entity linking (i.e. linking to key descriptors like people and places) will link to a knowledge base, for example Wikidata

knowledge graph
knowledge base that uses a graph-structured data model to integrate data; used for example to store interlinked descriptions of entities and abstract concepts
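
As a rough illustration of a graph-structured data model, the sketch below stores interlinked descriptions of entities as (subject, predicate, object) triples. The class name KnowledgeGraph, its methods and the sample triples are assumptions made for this example, not the interface of any real knowledge graph product.

```python
from collections import defaultdict


class KnowledgeGraph:
    """Tiny in-memory graph of (subject, predicate, object) triples."""

    def __init__(self):
        # Maps each subject to the set of (predicate, object) pairs leaving it.
        self._edges = defaultdict(set)

    def add(self, subject, predicate, obj):
        self._edges[subject].add((predicate, obj))

    def objects(self, subject, predicate):
        """Return the sorted objects linked to subject by predicate."""
        return sorted(o for p, o in self._edges.get(subject, ()) if p == predicate)


# Hypothetical triples, for illustration only.
kg = KnowledgeGraph()
kg.add("Paris", "capital_of", "France")
kg.add("Paris", "instance_of", "city")
kg.add("France", "instance_of", "country")
print(kg.objects("Paris", "instance_of"))
```

Because "France" is both the object of one triple and the subject of another, descriptions of different entities link up into a graph rather than staying in separate records.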

knowledge management / KM
process of creating, using, sharing and managing the information relating to a company or an organisation

lexicon
custom list of words used to train an algorithm for a natural language processing (NLP) application such as content moderation, speech synthesis or sentiment analysis; includes ontology creation (domain-specific or language-specific), pronunciation dictionary development, and corpora generation for text, images, audio or video

Lidar
acronym of “light detection and ranging”; a combination of 3D scanning and laser scanning to make 3D representations of geographical areas, including high-resolution maps used for autonomous vehicles

linguistic annotation
tagging of language data (text and audio) to identify grammatical, phonetic and semantic elements; includes part-of-speech (POS) tagging (labelling words based on their relation with adjacent and related words), phonetic annotation (labelling of intonation, stress and natural pauses), semantic annotation (tagging words with metadata, including keywords and keyphrases) and discourse annotation (linking words to their antecedent and postcedent subjects)

linguistic rule development
refers to a set of rules defined by humans to overcome ambiguity or imprecision in human language for natural language processing (NLP) applications, for example speech recognition in virtual assistants

locale
set of parameters that defines a user’s language (language identifier) and region (region identifier) in a user interface

localisation
adaptation of a translated product or information to a specific country, region or linguistic community, to suit its market and customs

machine learning
set of algorithms that is fed with structured data in order to complete a task; algorithms learn from training data to make data-driven decisions and predictions instead of following static program instructions

machine translation / MT
translation of text or speech from one natural language to another by a computer program; can be followed by light or heavy human post-editing by native speakers depending on cost, quality and speed requirements; can also be assessed by native speakers for quality assurance (QA), i.e. naturalness and relevance, for machine translation retraining

medical image data
data from medical imaging (x-rays, CT scans, MRI scans, microscopic images) used in computer vision (CV) models, for example to detect skin cancer, brain tumours and other diseases

metadata
data providing information about other data, for example identifiers, keywords and captions for images

model
mathematical algorithm that is trained using both training data and human expert input to replicate a decision process and enable automation; after the validation of a model dataset, the model is experimented with, built, validated, deployed and retrained

named entity recognition / NER
process of identifying and tagging named entities such as key descriptors (for example people or places) for e-commerce and social media companies

natural intelligence
intelligence displayed by humans (and animals); term used as a contrast to artificial intelligence

natural language
human language, which basically consists of a lexicon (list of words) and a grammar (for the combination of words into sentences); the two main components used in AI are syntax (the set of rules that govern the structure of sentences) and semantics (the meaning conveyed by text or speech); natural language is broken down into shorter segments in order to understand relations between the segments and how they connect to create meaning

natural language processing / NLP
technology used to teach computers to understand and process natural language (written and spoken) in order to generate responses in a human-like manner; uses human-annotated and automated training data for an AI algorithm to be able to read text, to understand speech and to analyse sentiment; used in text-to-speech (TTS) applications, personal assistants, chatbots, search queries, social media analytics, machine translation, information extraction and more

natural language understanding / NLU
subfield of natural language processing that includes intent definition (goal or purpose of human request), utterance collection (i.e. various utterances for the same intent) and entity extraction (for example people or places); used in search engine optimisation, news gathering, text categorisation, voice activation, large-scale content analysis, automated customer service, online education and more

network science
study of complex networks such as computer networks, telecommunication networks, cognitive networks, semantic networks and social networks

neural machine translation / NMT
machine translation that uses an artificial neural network to predict the likelihood of a sequence of words, typically modelling entire sentences in a single integrated model; is a combination of AI and statistical machine translation (SMT)

neural network / NN
also called artificial neural network (ANN), and inspired by the biological neural network; series of algorithms meant to recognise underlying relations in a set of data through a process that mimics the way the human brain operates

online rating
human-annotated and automated task consisting in evaluating search engine results (text, images, videos), online ads and digital maps in order to improve their content, relevance and quality

ontology
definition of concepts and entities in a subject area, with their interdependent properties and relations, according to a system of categories; often created automatically from a large dataset

open data
data that are freely available for everyone to use, re-use and redistribute without restrictions from copyright or patents

optical character recognition / OCR
technology that converts image files (scans, photographs) of original text (printed, typed, handwritten) into text files that can be edited, searched and indexed; in the case of handwritten text (sometimes dating back hundreds of years), a training dataset can be built for OCR to automatically understand and process such data

parsing
process of dividing a sentence into grammatical parts to annotate its syntactic and/or semantic structure

part of speech / POS
category of words with similar grammatical properties, with nine main parts of speech for the English language (noun, verb, article, adjective, preposition, pronoun, adverb, conjunction, interjection), and 50 to 150 sub-categories for part-of-speech tagging depending on the program used

part-of-speech tagging / POST
process of marking up a word as a particular part of speech in order to study its use in context, i.e. in relation with adjacent and related words in a phrase, sentence or paragraph

personal data
any personal information relating to an identifiable person (name, home address, identification card number, medical data, biometrics, etc.); ideally collected, handled, stored and accessed securely according to a specific regulation, for example GDPR (General Data Protection Regulation) for individuals living in the European Economic Area (EEA)

phrasebook
collection of ready-made phrases, often in the form of indexed questions and answers

pixel-level semantic segmentation / PLSS
labelling of images pixel by pixel for a computer vision (CV) model

point cloud data
set of data points in space to represent a 3D shape or object; provided by a 3D scanner, photogrammetry software, Lidar (light detection and ranging), radar (radio detection and ranging), and other types of scanners or sensors

pronunciation lexicon
lexicon used for speech systems to correctly recognise and pronounce words for greater accuracy when interacting with users; can be general or domain specific

proofing tool
application highlighting mistakes in grammar, spelling, punctuation and word choice to help a user write accurately; uses data training for linguistic annotation (identifying grammatical, phonetic and semantic elements), grammar creation (writing syntactic rules) and ontology creation (defining concepts and entities)

quality assurance / QA
process that includes testing and auditing to ensure the accuracy of an AI model

question answering / QA
system that automatically answers questions asked by a user in a natural language; involves information retrieval and natural language processing (NLP)

relational database
database based on a model that manages data as a set of relations

robot
machine that can help and assist humans, substitute for humans and replicate human actions; includes conversational robots, drones, medical robots (to assist surgical procedures with computer vision) and social robots (to provide services with speech and gesture)

robotics
field that integrates computer science and engineering to design, build, operate and use robots

script
computer program written for a special run-time environment to automate the execution of tasks

search engine relevance
process of ranking search results — either specific results or an entire listing — in order to improve user experience and satisfaction; includes ads, multimedia, news feeds, social media feeds, map verification and geo-local data

semantic analysis
human-annotated and automated analysis that includes named entity recognition (i.e. identifying entities such as people and places) and word sense disambiguation (i.e. giving meaning to a word based on context)

semantic annotation
includes improving search relevance by taking into account the user’s intent and context, or creating a text corpus to train a chatbot, for example a product listing with key descriptors

semantic segmentation
also named image segmentation; task of identifying different classes of objects in an image and classifying these objects into semantic categories (for example people or places)

semantics
study of the relations between form and meaning at the level of words, phrases, sentences and larger units of discourse in natural language (written and spoken)

sensor
device, module, machine or subsystem whose purpose is to detect changes or events in its environment and to send the resulting information to other electronics; applications include manufacturing, aerospace, autonomous vehicles, robotics and more

sensor fusion
process of combining sensory data from different sources to improve the resulting information, for example to obtain a more accurate location in aircraft navigation
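
One simple way to combine readings from different sources is an inverse-variance weighted average, sketched below. This is an illustrative assumption rather than the method used by any particular navigation system: it supposes that the sensors measure the same quantity independently and that the variance of each is known. The function name fuse and the sample altitude readings are hypothetical.

```python
def fuse(measurements):
    """Combine independent (value, variance) estimates of the same quantity.

    Each estimate is weighted by the inverse of its variance, so the more
    precise sensor counts for more. Returns (fused_value, fused_variance).
    """
    weights = [1.0 / variance for _, variance in measurements]
    total = sum(weights)
    value = sum(w * v for w, (v, _) in zip(weights, measurements)) / total
    return value, 1.0 / total


# Hypothetical altitude readings in metres, as (value, variance) pairs.
gps = (1000.0, 25.0)
barometer = (1010.0, 100.0)
value, variance = fuse([gps, barometer])
print(value, variance)
```

With these sample readings the fused altitude is about 1002 m and the fused variance about 20, lower than the variance of either sensor, which is the sense in which the combined information is improved.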

sentiment analysis
human-annotated and automated assessment of the attitudes, emotions and opinions of users; includes the identification and study of subjective information (user surveys, customer feedback forms, news media, social media posts) to extract particular words or phrases in order to understand user tone (positive, negative, neutral) and user sentiment (satisfaction, anger, sarcasm)

smart labelling
refers to the use of machine learning in the data annotation process in order to automate and improve the productivity, quality and delivery of a dataset

smart speaker
voice command device (VCD) with an integrated virtual assistant that can relay information, stream audio content and communicate with other devices, for example to control home automation devices

social media analytics
human-annotated and automated task used for example to track user sentiment about brands and products, to filter out fake news and to detect political bias

speaker recognition
identification of users from their voice biometrics

speech
vocal communication using natural or artificial language

speech dataset
dataset used to train voice-prompted virtual assistants, voice-activated search functions, voice-to-text capabilities and more

speech recognition
also known as automatic speech recognition (ASR); technology that converts spoken language into text in order to process speech data for voice-activated applications; consists in segmenting speech data into layers, speakers and time stamps; bias can be avoided by recording a broad range of participants (age, gender, ethnicity, language, dialect, accent, environment)

speech synthesis
artificial production of human-like speech by a computer program

statistics
branch of mathematics dealing with data collection, organisation, analysis, interpretation and presentation

sublanguage
subset of a natural language, a computer language or a relational database

syntactic analysis
human-annotated and automated task that includes language segmentation, lemmatisation (reducing a word to its base and grouping similar-based words), part-of-speech tagging, and stemming (removing affixes, i.e. prefixes and suffixes, from words to obtain root words)
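
To make the difference between stemming and lemmatisation concrete, here is a deliberately naive sketch for English. The suffix list, the lemma table and the function names naive_stem and naive_lemma are all assumptions made for this example; real tools use much larger rule sets and dictionaries.

```python
SUFFIXES = ("ing", "ed", "es", "s")  # illustrative list, far from complete


def naive_stem(word):
    """Strip the first matching suffix, keeping at least three letters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


# Hypothetical lemma lookup; real lemmatisers rely on a full dictionary.
LEMMAS = {"went": "go", "better": "good", "mice": "mouse"}


def naive_lemma(word):
    """Return the dictionary base form if known, else the word unchanged."""
    word = word.lower()
    return LEMMAS.get(word, word)


print([naive_stem(w) for w in ["walking", "walked", "walks", "boxes"]])
print(naive_lemma("went"))
```

Stemming works by cutting characters, so it cannot relate "went" to "go"; lemmatisation needs a lookup of base forms to do that.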

syntagma
elementary constituent segment within a text, for example a phoneme, word, phrase or sentence

syntax
arrangement of words in a sentence so that they make grammatical sense; study of the language structure (formation and composition of phrases and sentences) in order to describe how structural relations between elements in a sentence (often depicted in parse tree format) contribute to its interpretation; set of rules that define a structured computer program

taxonomy
science of classification of things and concepts meant to improve relevance in vertical search, for example a web search query

temporal segmentation
segmentation of human motion into actions, for example a vehicle parked, moving, in the same lane or changing lanes; the shapes used for segmentation are polygons, bounding boxes, cuboids, key points, lines and arrows in order to produce training data and build AI systems for autonomous vehicles

terminology
group of specialised terms (words and compound words) relating to a particular field, and the study of such terms, their development and their use

text annotation
human-annotated and automated tagging of text data with keywords and keyphrases in order to train machine learning models; includes sentiment annotation (attitudes and emotions such as positive, negative or neutral), intent annotation (request, command, confirmation) and semantic annotation (concepts and entities such as people, places and topics)

text corpus
structured set of texts for storage and processing; can be for example a monolingual corpus, a multilingual corpus, a translation corpus (texts and their translations), a parallel corpus (texts alongside their translations) or a comparable corpus (texts covering the same contents)

text processing
creation and manipulation of electronic text to detect duplicates, coherence and language; includes text summarisation, text analysis of content, sentiment and intent (to better understand customer queries and interactions), and entity extraction (to improve the cognitive ability of a machine learning model)

text summarisation
task of condensing a text into a synopsis with machine learning, for example for a news aggregator; extractive text summarisation pulls keyphrases from a text and uses them to create a synopsis; abstractive text summarisation aims to understand the meaning behind the text and to communicate it in newly generated sentences
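
The extractive approach can be illustrated with a very small sketch. As a simplification, it selects whole sentences rather than keyphrases, scoring each sentence by the average frequency of its words in the text, with no machine learning and no stop-word filtering. The function name summarise and the sample text are made up for the example.

```python
import re
from collections import Counter


def summarise(text, max_sentences=1):
    """Return the highest-scoring sentences, in their original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence):
        # Average frequency, in the whole text, of the words in this sentence.
        tokens = re.findall(r"[a-z']+", sentence.lower())
        return sum(freq[t] for t in tokens) / max(len(tokens), 1)

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])
    return " ".join(sentences[i] for i in keep)


# Hypothetical sample text, for illustration only.
text = "Cats sleep. Cats purr and cats sleep a lot. Dogs bark."
print(summarise(text))
```

An abstractive summariser would instead generate new sentences, which requires a trained language model and is beyond the scope of a short sketch.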

thesaurus
listing of words grouped and classified according to similarity of meaning, with a controlled vocabulary organising semantic metadata for information storage and retrieval

time stamping
refers to the date and time information attached to any digital media, for example a timestamp indicating when a text file was last modified, or a timestamp indicating when a digital picture was taken

training data
human-labelled data (text, alphanumeric, images, audio, video, geo-local, speech, natural language, URLs, sensors, point clouds) used to train an algorithm for an AI system to make accurate predictions when presented with new data in real-world applications

training data / linguistic tasks
includes human-powered and automated tasks such as (in alphabetical order) building parallel corpora for machine translation, dictionary building, linguistic annotation, linguistic rule development, localisation, machine translation quality evaluation, mass translation, pronunciation lexicon development, speech data collection, text corpus annotation, translation and voice recording

training data / tasks (other than linguistic)
includes human-powered and automated tasks such as (in alphabetical order) ad evaluation (relevance, rating, intent, context), application testing, audio classification, audio transcription, chatbot training, content comparison, content evaluation (web and social media), content rating, content moderation (assessing and filtering content), data analysis, data annotation, data categorisation, data cleansing (removing duplicates), data collection, data enrichment, data entry, data extraction, data labelling, data mining, data matching, data parsing, data tagging, data validation, data verification, entity annotation, entity linking, geo-local data evaluation, image annotation, image labelling, image tagging, image transcription, intent classification, intent recognition, intent variation, map data analysis, product categorisation, product testing, search evaluation, search rating, sentiment analysis, survey data collection, text classification, text summarisation, video transcription and video tagging

usability testing
testing a product with users in real-world situations to ensure a smooth interaction, for example testing a chatbot with users that have different accents and different ways of saying the same thing to ensure all of them get a proper response

user interface / UI
can be a command-line interface (CLI) or a graphical user interface (GUI) for human-computer interaction

video annotation
human-annotated and automated task that includes splitting video into frame-by-frame sequences, object detection, object identification, object labelling, object tracking (temporal and spatial features), video transcription and time stamping

virtual assistant
voice-enabled software application that uses speech recognition, speech synthesis and natural language processing (NLP) to respond to user queries and provide services in reaction to voice commands, for example answering routine questions, setting calendar events and playing music

virtual reality / VR
technology that replaces the user’s real-world environment with a computer-generated simulation of a three-dimensional environment that can be accessed with electronic equipment such as a helmet with a screen or gloves with sensors

voice biometrics
technology that uses a person’s voice as a uniquely identifying biological characteristic in order to authenticate and verify the identity of a speaker as part of a security process

voice recognition
refers to speech recognition (recognition of spoken language by computers, i.e. determining what is being said) or speaker recognition (authentication and verification of the identity of a speaker, i.e. determining who is speaking)

voice tag
short audio phrase used as a command for a voice command device (VCD) or a voice user interface (VUI)

wearable device
smart electronic device with micro-controllers that can be incorporated into clothing, worn as an accessory (for example a smartwatch or a fitness tracker) or worn on/in the body as an implant

writing system
method of visually representing verbal communication by converting spoken language into visual symbols for a wider communication across space and time

Copyright © 2021 Marie Lebert
License CC BY-NC-SA version 4.0
