Co-occurrences

Co-occurrence relation types can be created between two entity types that share a connection to at least one other entity type.
 
1. Relation Name
Enter the name of the co-occurrence dynamic relation type.
  • The name should typically end in "Co-occurrence" or "Co-occ".
  • The name updates automatically, based on the endpoints, when you change the Entity 1 and Entity 2 dropdowns.
  • After you edit the relation type name yourself, it is shown in bold, indicating that further dropdown changes will no longer overwrite it.
     
2. Relation Endpoints
Select the two entity types that are the endpoints of the co-occurrence. Because a co-occurrence is an undirected relation type, the endpoints are called Entity 1 and Entity 2 rather than Src and Dst.
 
  • Co-occurrences can be created between two entities of the same type (two senators) or of two different types (a state and a bill).
  • The endpoints of an existing co-occurrence cannot be changed. To use different endpoints, delete the co-occurrence and create a new one in its place.
 
3. Constituent Relations
 
The Based on Relations table lets you select which constituent relation types will be used to create the co-occurrence type.
 
At least one constituent relation check box must be selected.
 
A co-occurrence is formed by combining constituent relation types. 
 
In the image above, the "Senator Co-occurrence" has been created by combining "Bill Sponsor Relations" and "Bill Cosponsor Relations". Olympia Snowe and Mary Landrieu are two entities of type Senator. No senator-to-senator relation type is directly present in any data table. Instead, the relationship is created by connecting two senators with a co-occurrence whenever both cosponsored the same bill, or whenever one sponsored a bill and the other cosponsored it.
 
Likewise, there is no direct relationship between LA (Louisiana) and a bill sponsored by Mary Landrieu, but combining the Senator-State and Senator-Bill relationships forms a State-Bill co-occurrence.
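
The two cases described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the application's implementation: the function names, the bill labels and the (source, destination) tuple layout are all assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations

# Made-up constituent relations as (senator, other endpoint) pairs.
sponsor = [("Olympia Snowe", "Bill 1"), ("Mary Landrieu", "Bill 2")]
cosponsor = [("Mary Landrieu", "Bill 1"), ("Olympia Snowe", "Bill 2"),
             ("Olympia Snowe", "Bill 3")]
senator_state = [("Mary Landrieu", "LA"), ("Olympia Snowe", "ME")]


def same_type_cooccurrences(*relations):
    """Pair entities of one type that share at least one intermediate entity.

    Each relation is a list of (endpoint, intermediate) pairs, for example
    (senator, bill). Returns {(a, b): set of shared intermediates}.
    """
    members = defaultdict(set)  # intermediate -> endpoints connected to it
    for relation in relations:
        for endpoint, intermediate in relation:
            members[intermediate].add(endpoint)
    shared = defaultdict(set)
    for intermediate, endpoints in members.items():
        # The co-occurrence is undirected, so each pair is stored once, sorted.
        for a, b in combinations(sorted(endpoints), 2):
            shared[(a, b)].add(intermediate)
    return dict(shared)


def cross_type_cooccurrences(to_entity1, to_entity2):
    """Join two relations on their common intermediate entity.

    Both arguments are lists of (intermediate, endpoint) pairs, for example
    (senator, state) and (senator, bill).
    Returns {(entity1, entity2): set of shared intermediates}.
    """
    by_intermediate = defaultdict(set)
    for intermediate, entity1 in to_entity1:
        by_intermediate[intermediate].add(entity1)
    shared = defaultdict(set)
    for intermediate, entity2 in to_entity2:
        for entity1 in by_intermediate.get(intermediate, ()):
            shared[(entity1, entity2)].add(intermediate)
    return dict(shared)


senator_cooc = same_type_cooccurrences(sponsor, cosponsor)
state_bill_cooc = cross_type_cooccurrences(senator_state, sponsor)
```

With this sample data, `senator_cooc` links the two senators through Bill 1 and Bill 2 (Bill 3 has only one senator attached, so it creates no pair), and `state_bill_cooc` links LA to the bill that Mary Landrieu sponsored.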
 
  • If Entity Type 1 and Entity Type 2 are the same (two senators), each constituent relation must have a Senator as one endpoint (Src or Dst) and a different entity type (a bill) as the other.
  • If Entity Type 1 and Entity Type 2 are different (a bill and a state), the constituent relations must connect Entity Type 1 (a bill) to a third entity type (a senator), and that third entity type to Entity Type 2 (a state).
  • Any number of constituent relation types can be combined, as long as they connect Entity Type 1 to Entity Type 2 through an intermediate type. For instance, a co-occurrence between senators can incorporate both Senator-Bill relation types and Senator-State relation types.
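
One possible reading of these rules, written as a small check in Python. It is a sketch only: `valid_constituents` and `linked_types` are hypothetical names, relation types are reduced to (Src type, Dst type) pairs, and direction is assumed not to matter because the co-occurrence itself is undirected. The dialog may apply stricter checks than this.

```python
def linked_types(entity_type, relations):
    """Entity types directly connected to entity_type, ignoring direction."""
    linked = set()
    for src, dst in relations:
        if src == entity_type and dst != entity_type:
            linked.add(dst)
        elif dst == entity_type and src != entity_type:
            linked.add(src)
    return linked


def valid_constituents(entity1, entity2, relations):
    """Check a selection of constituent relation types against the rules.

    relations is a list of (src_type, dst_type) pairs, one per selected
    constituent relation type.
    """
    if not relations:
        return False  # at least one constituent relation must be selected
    if entity1 == entity2:
        # Same endpoint type: every relation needs exactly one endpoint of
        # that type and a different type at the other end.
        return all((src == entity1) != (dst == entity1)
                   for src, dst in relations)
    # Different endpoint types: some third type must be linked to both.
    intermediates = linked_types(entity1, relations) & linked_types(entity2, relations)
    return bool(intermediates - {entity1, entity2})
```

For example, `valid_constituents("Bill", "State", [("Senator", "Bill"), ("Senator", "State")])` is true because Senator links both endpoint types, while selecting only the Senator-Bill relation is not enough for a Bill-State co-occurrence.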
 
4. Dynamic Attributes

 
Several dynamic attributes can be created for dynamic relation types. By default, only Co-occ # and C-Rank are created; the others must be selected manually in the Dynamic Attributes dialog.
 
  • Co-occ #: Co-occurrence # is the number of entities that both endpoints of the co-occurrence are connected to through constituent relations. For instance, for a co-authorship co-occurrence between two authors, the co-occurrence # is the number of documents that the two authors have written together.
     
  • C-Rank: C-Rank ranks co-occurrences by Co-occurrence #. A rank of 1 indicates the strongest relationship, and higher values of C-Rank indicate weaker relationships. For a co-occurrence between entities A and B, first sort the co-occurrences between A and all entities of B's type and note the position at which the A-B co-occurrence appears. Then do the same with the sorted list of co-occurrences between B and all entities of A's type. C-Rank is the minimum of these two positions.
     
  • Signif: Significance is the probability that A and B would have the co-occurrence # that they do, divided by the expected value of the overlap between two entities that have the same number of constituent relations as A and B but occur at random. That is, it measures the importance of the overlap rather than the size of the intersection. For instance, given a pool of 100 documents and two authors who have written 50 each, random chance predicts that the authors would have written 25 together. If they have co-authored more than 25 documents, the significance has a positive value. If the two authors had instead written only 25 documents each, and again have a co-occurrence # of 25 out of a pool of 100, the significance of having co-authored 25 documents is much larger than when they had written 50 documents each.
     
  • S-Rank: S-Rank is similar to C-Rank except it is computed by sorting co-occurrences by significance rather than by co-occurrence #.
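
The attribute descriptions above can be made concrete with a short Python sketch. The data and function names are invented for the example, and two assumptions are made that the text does not state: tied co-occurrences share the better rank, and the chance overlap is estimated as n_a × n_b / pool size (which reproduces the 25 in the example above). The exact Signif formula is not given here, so only the expected overlap is computed.

```python
from itertools import combinations

# Hypothetical author -> set of document ids.
written = {"Ann": {1, 2, 3, 4}, "Bob": {1, 2, 3}, "Cy": {3, 4, 5}}


def co_occurrence_counts(written):
    """Co-occ # per unordered author pair: documents written together."""
    return {frozenset((a, b)): len(written[a] & written[b])
            for a, b in combinations(sorted(written), 2)
            if written[a] & written[b]}


def rank_from(counts, entity, partner):
    """1-based position of entity-partner among entity's co-occurrences,
    sorted by Co-occ # descending (ties share the better rank: assumption)."""
    target = counts[frozenset((entity, partner))]
    mine = [n for pair, n in counts.items() if entity in pair]
    return 1 + sum(n > target for n in mine)


def c_rank(counts, a, b):
    """Minimum of the rank seen from A's side and the rank seen from B's side."""
    return min(rank_from(counts, a, b), rank_from(counts, b, a))


def expected_overlap(n_a, n_b, pool):
    """Overlap expected by chance, assuming independent random selection."""
    return n_a * n_b / pool


counts = co_occurrence_counts(written)
print(counts[frozenset(("Ann", "Bob"))])   # 3 shared documents
print(c_rank(counts, "Bob", "Cy"))         # 2: second for Bob and second for Cy
print(expected_overlap(50, 50, 100))       # 25.0
print(expected_overlap(25, 25, 100))       # 6.25
```

The last two lines show why the second scenario in the Signif example is more significant: the same observed overlap of 25 is compared against a chance overlap of 6.25 rather than 25. Sorting by significance instead of by Co-occ # in `rank_from` would give the corresponding S-Rank.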
     
5. OK, Cancel

  • OK: Add the co-occurrence dynamic relation type or apply changes to the relation type and return to the Dynamic Relations page.
  • Cancel: Discard changes and return to the Dynamic Relations page.