Revision as of 19:55, 2 January 2014
Line 1:
−
'''Cyc''' is a project in the field of [[Sztuczna inteligencja|artificial intelligence]] (AI) that aims to create a complete knowledge base of so-called ''common sense''. It is intended as a foundation that enables AI programs to carry out human-like reasoning.
+
'''Cyc''' is an [[List of notable artificial intelligence projects|artificial intelligence project]] that attempts to assemble a comprehensive [[ontology (computer science)|ontology]] and [[knowledge base]] of everyday [[common sense knowledge]], with the goal of enabling [[artificial intelligence|AI]] applications to perform human-like reasoning.
+
+
The project was started in 1984 by [[Douglas Lenat]] at [[Microelectronics and Computer Technology Corporation|MCC]] and is developed by the Cycorp company.
+
Parts of the project are released as '''OpenCyc''', which provides an API, [http://sw.opencyc.org/ RDF endpoint], and [[data dump]] under an [[open source]] license.
{{Infobox info
|Nazwa=Cyc
Line 9:
Line 12:
|Strona internetowa=[http://www.cyc.com www.cyc.com]
}}
−
The project was started in [[1984]] by Dr. [[Doug Lenat]]. Although the name "Cyc" (pronounced ''sajk'') derives from the English word "encyclopedia" ([[encyklopedia]]), the knowledge base built in this project contains much more information about the objects it describes than simple definitions. The structure of the knowledge base allows reasoning to be carried out and conclusions to be drawn automatically. The project was initially planned to run for 10 years, but it is still under active development and it is hard to say whether it will succeed. Cyc is currently owned by the corporation ''Cycorp''. One of the first practical applications of the system is [[CycSecure]], which examines the security of a real [[Sieć komputerowa|computer network]] by simulating attacks on that network.
+
==Overview==
+
The project was started in 1984 as part of [[Microelectronics and Computer Technology Corporation]]. The objective was to codify, in machine-usable form, millions of pieces of knowledge that compose human common sense. CycL presented a proprietary knowledge representation schema that utilized first-order relationships.
+
The Cyc Project was spun off into Cycorp, Inc. in [[Austin, Texas]] in 1994.
+
+
The name "Cyc" (from "encyclopedia", pronounced {{IPA|[saɪk]}} like ''syke'') is a registered trademark owned by Cycorp. The original knowledge base is proprietary, but a smaller version of the knowledge base, intended to establish a common vocabulary for automatic reasoning, was released as OpenCyc under an [[open source]] (Apache) license. More recently, Cyc has been made available to AI researchers under a research-purposes license as [[ResearchCyc]].
+
+
Typical pieces of knowledge represented in the database are "Every tree is a plant" and "Plants die eventually". When asked whether trees die, the inference engine can draw the obvious conclusion and answer the question correctly. The Knowledge Base (KB) contains over one million human-defined assertions, rules or common sense ideas. These are formulated in the language [[CycL]], which is based on [[predicate calculus]] and has a [[syntax]] similar to that of the [[Lisp programming language]].
+
+
Much of the current work on the Cyc project continues to be [[knowledge engineering]], representing facts about the world by hand, and implementing efficient inference mechanisms on that knowledge. Increasingly, however, work at Cycorp involves giving the Cyc system the ability to communicate with end users in [[natural language]], and to assist with the [[knowledge formation]] process via [[machine learning]].
+
+
Like many companies, Cycorp has ambitions to use the Cyc [http://www.cyc.com/cyc/cycrandd/areasofrandd_dir/cycrandd/nlu natural language understanding tools] to parse the entire internet to extract structured data.
+
+
In 2008, Cyc resources were mapped to many [[Wikipedia]] articles, potentially making it easier to connect Cyc with other open datasets such as [[DBpedia]] and [[Freebase (database)|Freebase]].
+
+
==Knowledge base==
+
The concept names in Cyc are known as ''constants''. Constants start with an optional "#$" and are case-sensitive. There are constants for:
+
* Individual items known as ''individuals'', such as #$BillClinton or #$France.
+
* ''Collections'', such as #$Tree-ThePlant (containing all trees) or #$EquivalenceRelation (containing all [[equivalence relation]]s). A member of a collection is called an ''instance'' of that collection.
+
* ''Truth Functions'' which can be applied to one or more other concepts and return either true or false. For example #$siblings is the sibling relationship, true if the two arguments are siblings. By convention, truth function constants start with a lower-case letter. Truth functions may be broken down into logical connectives (such as #$and, #$or, #$not, #$implies), quantifiers (#$forAll, #$thereExists, etc.) and [[Predicate (logic)|predicate]]s.
+
* ''Functions'', which produce new terms from given ones. For example, #$FruitFn, when provided with an argument describing a type (or collection) of plants, will return the collection of its fruits. By convention, function constants start with an upper-case letter and end with the string "Fn".
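The naming conventions in the list above can be turned into a rough classifier. The following Python sketch is illustrative only: `classify_constant` is a hypothetical helper, not part of Cyc or OpenCyc, and it relies solely on the spelling conventions stated above (optional "#$" prefix, lower-case initial for truth functions, upper-case initial plus a trailing "Fn" for functions). Individuals and collections cannot be told apart by name, so they share one label.

```python
def classify_constant(name: str) -> str:
    """Guess the kind of a Cyc constant from its spelling alone (hypothetical helper)."""
    # The "#$" prefix is optional, so drop it before inspecting the name.
    bare = name[2:] if name.startswith("#$") else name
    if not bare:
        raise ValueError("empty constant name")
    # Convention: truth function constants start with a lower-case letter.
    if bare[0].islower():
        return "truth function"
    # Convention: function constants start upper-case and end with "Fn".
    if bare[0].isupper() and bare.endswith("Fn"):
        return "function"
    # The name alone does not distinguish individuals from collections.
    return "individual or collection"


if __name__ == "__main__":
    for constant in ["#$siblings", "#$FruitFn", "#$BillClinton", "Tree-ThePlant"]:
        print(constant, "->", classify_constant(constant))
```

Because these are conventions rather than rules enforced by the language, such a classifier is only a heuristic; the authoritative answer comes from the assertions made about each constant in the knowledge base.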
+
+
The most important predicates are #$isa and #$genls. The first one describes that one item is an [[Instance (computer science)|instance]] of some collection, the second one that one collection is a subcollection of another one. Facts about concepts are asserted using certain CycL ''sentences''. Predicates are written before their arguments, in parentheses:
+
(#$isa #$BillClinton #$UnitedStatesPresident)
+
"Bill Clinton belongs to the collection of U.S. presidents" and
+
(#$genls #$Tree-ThePlant #$Plant)
+
"All trees are plants".
+
(#$capitalCity #$France #$Paris)
+
"Paris is the capital of France."
+
+
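Since CycL sentences are written as parenthesized prefix expressions, the examples above can be read with a very small Lisp-style reader. The Python sketch below is an assumption-laden illustration, not the actual CycL parser: `tokenize`, `parse` and `parse_sentence` are hypothetical names, and the reader handles only parentheses and whitespace-separated symbols (no quoted strings containing spaces).

```python
def tokenize(text):
    """Split a sentence into parentheses and whitespace-separated symbols."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    """Consume tokens from the front and return a symbol or a nested list."""
    if not tokens:
        raise SyntaxError("unexpected end of input")
    token = tokens.pop(0)
    if token == "(":
        items = []
        while tokens and tokens[0] != ")":
            items.append(parse(tokens))
        if not tokens:
            raise SyntaxError("missing closing parenthesis")
        tokens.pop(0)  # discard the matching ")"
        return items
    if token == ")":
        raise SyntaxError("unexpected closing parenthesis")
    return token


def parse_sentence(text):
    """Parse exactly one sentence; trailing tokens are an error."""
    tokens = tokenize(text)
    tree = parse(tokens)
    if tokens:
        raise SyntaxError("unexpected trailing input")
    return tree


if __name__ == "__main__":
    print(parse_sentence("(#$isa #$BillClinton #$UnitedStatesPresident)"))
    print(parse_sentence("(#$capitalCity #$France #$Paris)"))
```

In the parsed form the predicate is always the first element of the list and its arguments follow, mirroring the prefix notation of the source sentences.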
Sentences can also contain variables, strings starting with "?". These sentences are called "rules". One important rule asserted about the #$isa predicate reads
+
(#$implies
+
(#$and
+
(#$isa ?OBJ ?SUBSET)
+
(#$genls ?SUBSET ?SUPERSET))
+
(#$isa ?OBJ ?SUPERSET))
+
with the interpretation "if OBJ is an instance of the collection [[subset|SUBSET]] and SUBSET is a subcollection of [[superset|SUPERSET]], then OBJ is an instance of the collection SUPERSET". Another typical example is
+
(#$relationAllExists #$biologicalMother #$ChordataPhylum #$FemaleAnimal)
+
which means that for every instance of the collection #$ChordataPhylum (i.e. for every [[chordate]]), there exists a female animal (instance of #$FemaleAnimal) which is its mother (described by the predicate #$biologicalMother).
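The #$isa / #$genls rule quoted above can be illustrated by applying it repeatedly until no new facts appear. The Python sketch below is a toy fixed-point loop under stated assumptions, not how the Cyc inference engine works internally: `infer_isa` is a hypothetical function, and the constants `#$OakInMyYard` and `#$Organism` are invented for illustration (only `#$BillClinton`, `#$UnitedStatesPresident`, `#$Tree-ThePlant` and `#$Plant` come from the text).

```python
def infer_isa(isa_facts, genls_facts):
    """Close a set of (object, collection) pairs under the rule
    isa(OBJ, SUBSET) and genls(SUBSET, SUPERSET) => isa(OBJ, SUPERSET)."""
    known = set(isa_facts)
    changed = True
    while changed:  # repeat until a full pass adds nothing new
        changed = False
        for obj, subset in list(known):
            for sub, superset in genls_facts:
                if sub == subset and (obj, superset) not in known:
                    known.add((obj, superset))
                    changed = True
    return known


# Facts from the text, plus two hypothetical ones marked below.
ISA = {
    ("#$BillClinton", "#$UnitedStatesPresident"),
    ("#$OakInMyYard", "#$Tree-ThePlant"),  # hypothetical individual
}
GENLS = {
    ("#$Tree-ThePlant", "#$Plant"),
    ("#$Plant", "#$Organism"),  # hypothetical supercollection
}

if __name__ == "__main__":
    for fact in sorted(infer_isa(ISA, GENLS)):
        print(fact)
```

The loop runs to a fixed point so that chains of #$genls links are followed: the hypothetical oak becomes an instance of #$Plant on one pass and of the hypothetical #$Organism on the next.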
+
+
The [[knowledge base]] is divided into ''microtheories'' (Mt), collections of concepts and facts typically pertaining to one particular realm of knowledge. Unlike the knowledge base as a whole, each microtheory is required to be free from contradictions. Each microtheory has a name which is a regular constant; microtheory constants contain the string "Mt" by convention. An example is #$MathMt, the microtheory containing mathematical knowledge. The microtheories can inherit from each other and are organized in a hierarchy:
+
one specialization of #$MathMt is #$GeometryGMt, the microtheory about geometry.
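One way to picture microtheory inheritance is as a lookup that walks up the hierarchy. The Python sketch below assumes that a specialized microtheory sees its own assertions plus those of every microtheory it inherits from; this reading, the function name `visible_assertions`, and the placeholder assertion strings are all assumptions made for illustration, not a description of Cyc's actual implementation.

```python
def visible_assertions(mt, parents, assertions):
    """Collect assertions stated in `mt` and in every microtheory it inherits from."""
    seen = set()
    stack = [mt]
    result = []
    while stack:
        current = stack.pop()
        if current in seen:  # guard against visiting a microtheory twice
            continue
        seen.add(current)
        result.extend(assertions.get(current, []))
        stack.extend(parents.get(current, []))
    return result


# #$GeometryGMt specializes #$MathMt, as in the text.
PARENTS = {"#$GeometryGMt": ["#$MathMt"]}
# Placeholder strings stand in for real CycL assertions.
ASSERTIONS = {
    "#$MathMt": ["<some mathematical assertion>"],
    "#$GeometryGMt": ["<some geometric assertion>"],
}

if __name__ == "__main__":
    print(visible_assertions("#$GeometryGMt", PARENTS, ASSERTIONS))
    print(visible_assertions("#$MathMt", PARENTS, ASSERTIONS))
```

Under this assumption, a query posed in #$GeometryGMt can draw on general mathematical knowledge, while a query posed in #$MathMt does not see geometry-specific assertions.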
−
The database, the so-called ''knowledge base'' (KB), is written in the CycL language, which somewhat resembles [[Lisp]]. CycL programmers are called "cyclists". The basic building blocks of the database are so-called ''constants''. They fall into several basic groups: individual items, or concepts (e.g. #$Poland, #$HomerSimpson), collections (e.g. #$Tree-ThePlant, the collection of all trees), logical operators (e.g. #$and, #$implies), quantifiers (e.g. #$forAll), predicates (e.g. #$isa, #$genls) and functions (e.g. #$FruitFn). All ''constants'' are linked to other constants by predicates and belong to so-called ''microtheories'', which must be internally consistent. Each microtheory is identified by a constant.
+
==Inference engine==
+
An [[inference engine]] is a computer program that tries to derive answers from a knowledge base.
+
The Cyc inference engine performs general [[logical deduction]] (including [[modus ponens]], [[modus tollens]], [[universal quantification]] and [[existential quantification]]).
−
Cyc is currently available free of charge in a reduced version called OpenCyc. In addition, a ResearchCyc version is available, which is provided to researchers and research institutions, also free of charge.
+
==Releases==
−
===External links===
+
===OpenCyc===
−
* [http://cyc.com Official website of Cycorp, Inc.]
+
The latest version of OpenCyc, 4.0, was released in June 2012. OpenCyc 4.0 includes the entire Cyc ontology containing hundreds of thousands of terms, along with millions of assertions relating the terms to each other; however, these are mainly taxonomic assertions, not the complex rules available in Cyc. The knowledge base contains 239,000 concepts and 2,093,000 facts and can be browsed on the OpenCyc website.
−
* [http://opencyc.org OpenCyc project page], with downloadable versions for [[Linux]] and [[Microsoft Windows|Windows]]
+
−
* [http://video.google.com/videoplay?docid=-7704388615049492068 Lecture on Cyc], given by Dr. Doug Lenat (Google Video).
+
+
The first version of OpenCyc was released in spring 2002 and contained only 6,000 concepts and 60,000 facts. The knowledge base is released under the [[Apache License]]. [[Cycorp]] has stated its intention to release OpenCyc under parallel, unrestricted licences to meet the needs of its users. The [[CycL]] and [[SubL]] interpreter (the program that allows you to browse and edit the database as well as to draw inferences) is released free of charge, but only as a binary, without source code. It is available for [[Linux]] and [[Microsoft Windows]]. The open source Texai project has released the [[Resource Description Framework|RDF]]-compatible content extracted from OpenCyc.
+
[[Category:Common Lisp software]]
+
[[Category:Ontology (information science)]]
+
[[Category:Knowledge bases]]
+
[[Category:Artificial intelligence]]
+
[[Category:Open data]]
[[Kategoria:Sztuczna inteligencja]]
[[Kategoria:Bazy danych]]