Dual Database Architecture:

    • The Dual Database Architecture keeps the Knowledge Base of identities separate from the federated agency-specific data.
    • The Knowledge Base (KB) is composed of two parts: the Knowledge-based Identity Management (KIM) system and the Trusted Identifier Manager (TIM).

    Knowledge-based Identity Management (KIM) System:

    • Kept offline (powered down) unless actively processing data.
    • Able to keep track of multiple representations of the same person.
    • Enables higher accuracy in matching.
    • Protects privacy by re-using non-personal references in reports and studies.
    • Without a KB, personally identifiable information (PII) must be pulled and transmitted for each data request.
    • Enables longitudinal research by providing a single path for data flow.
    • Limits transmission of identity data to a single copy.
    • Removes all agency data from the identity records, creating a wall of separation between identities and agency data.
    • Replaces personal information with a hashed cluster ID (KIM ID).
    • Once the data have been processed by KIM, the personal information is no longer needed for research requests or for matching across agency data.
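
    The replacement of personal information with a KIM ID can be sketched as follows. This is a minimal illustration only: the source does not say how the cluster ID is hashed, so the keyed hash (HMAC-SHA-256), the secret key, the cluster numbers, and the names `kim_id`, `identity_clusters`, and `to_kim_reference` are all assumptions made for the example.

```python
import hashlib
import hmac

# Hypothetical secret held only on the KIM server (an assumption, not from the source).
KIM_SECRET = b"example-key-held-only-on-the-kim-server"


def kim_id(cluster_number: int) -> str:
    """Return a keyed hash of an internal cluster number (illustrative KIM ID)."""
    mac = hmac.new(KIM_SECRET, str(cluster_number).encode("utf-8"), hashlib.sha256)
    return mac.hexdigest()[:16]


# Several representations of the same person, already grouped into clusters.
# The matching step that builds these clusters is out of scope for this sketch.
identity_clusters = {
    ("Jon Smith", "1990-04-02"): 1001,
    ("Jonathan Smith", "1990-04-02"): 1001,
    ("Ana Lopez", "1985-11-30"): 1002,
}


def to_kim_reference(record: dict) -> dict:
    """Drop name and birth date; keep only a record reference and the KIM ID."""
    cluster = identity_clusters[(record["name"], record["dob"])]
    return {"record_ref": record["record_ref"], "kim_id": kim_id(cluster)}


a = to_kim_reference({"record_ref": "r-1", "name": "Jon Smith", "dob": "1990-04-02"})
b = to_kim_reference({"record_ref": "r-2", "name": "Jonathan Smith", "dob": "1990-04-02"})
c = to_kim_reference({"record_ref": "r-3", "name": "Ana Lopez", "dob": "1985-11-30"})
```

    In this sketch, both spellings of the same person resolve to one KIM ID, and nothing that leaves the function contains a name or birth date.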

    Trusted Identifier Manager (TIM):

    • After receiving a KIM ID, TIM assigns an agency-specific research ID.
    • Agency records cannot be matched without approval from the agencies.
    • When records from two state agencies require linking, TIM creates a temporary crosswalk and a research-specific substitute ID.
    • The server that contains KIM/TIM is offline unless actively processing data.
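
    A rough sketch of the TIM behavior described above, under stated assumptions: the class name, method names, random-token ID format, and the way approval is represented (a set of approving agency names) are hypothetical and are not taken from the source.

```python
import secrets


class TrustedIdentifierManager:
    """Illustrative TIM: maps KIM IDs to agency-specific research IDs."""

    def __init__(self):
        # (agency, kim_id) -> research ID; kept only on the KIM/TIM server.
        self._research_ids = {}

    def research_id(self, agency: str, kim_id: str) -> str:
        """Return the stable research ID for this person within one agency."""
        key = (agency, kim_id)
        if key not in self._research_ids:
            self._research_ids[key] = f"{agency}-{secrets.token_hex(8)}"
        return self._research_ids[key]

    def crosswalk(self, agency_a: str, agency_b: str, approvals: set) -> list:
        """Build a temporary crosswalk, but only if both agencies approved."""
        if not {agency_a, agency_b} <= set(approvals):
            raise PermissionError("both agencies must approve the linkage")
        by_kim = {}
        for (agency, kim), rid in self._research_ids.items():
            if agency in (agency_a, agency_b):
                by_kim.setdefault(kim, {})[agency] = rid
        rows = []
        for ids in by_kim.values():
            if len(ids) == 2:  # person appears in both agencies
                rows.append({
                    "study_id": secrets.token_hex(8),  # research-specific substitute ID
                    agency_a: ids[agency_a],
                    agency_b: ids[agency_b],
                })
        return rows


tim = TrustedIdentifierManager()
edu_id = tim.research_id("education", "kim-abc")
lab_id = tim.research_id("labor", "kim-abc")
tim.research_id("labor", "kim-xyz")  # appears in only one agency
rows = tim.crosswalk("education", "labor", approvals={"education", "labor"})
```

    The same person receives a different research ID in each agency, and the crosswalk is produced only when both agencies appear in the approvals. A real deployment would also need to discard the crosswalk once the study ends, which this sketch does not model.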

    Federated Agency-Specific Anonymized Data:

    • Research data are stored on a physically separate server from personal information.
    • Each agency retains ownership of its data and has its own set of research IDs; agency data cannot be linked to any other data without TIM.
    • The anonymized agency data are updated through KIM, so the same person keeps the same research ID over time, which allows longitudinal use.
    • In the event of a data breach, there is no connection between separate agencies’ records, and agency-specific IDs cannot be joined without TIM.
    • Full multi-year extracts are no longer needed from state agencies; only recent updates are required.
    • Long-term studies can be supported, and privacy protected, by using consistent research IDs rather than re-matching personal information.
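
    The points above about incremental updates and breach isolation can be illustrated with a small sketch. All store layouts, field names, ID values, and the `apply_update` and `link` helpers are hypothetical; the crosswalk rows stand in for what TIM would supply for an approved study.

```python
from collections import defaultdict

# Hypothetical anonymized store for one agency: research ID -> list of yearly rows.
education_store = defaultdict(list)


def apply_update(store, extract):
    """Append rows from a recent extract; earlier years are not re-sent."""
    for row in extract:
        store[row["research_id"]].append({"year": row["year"], "value": row["value"]})


# Two yearly updates for the same person, keyed by the same research ID.
apply_update(education_store, [{"research_id": "edu-01", "year": 2022, "value": "enrolled"}])
apply_update(education_store, [{"research_id": "edu-01", "year": 2023, "value": "graduated"}])

# A second agency's store uses its own, unrelated research IDs.
labor_store = {"lab-77": [{"year": 2023, "value": "employed"}]}

# Without TIM there is no common key between the two stores.
shared_keys = set(education_store) & set(labor_store)

# Stand-in for a temporary crosswalk issued by TIM for an approved study.
crosswalk = [{"study_id": "s-001", "education": "edu-01", "labor": "lab-77"}]


def link(crosswalk_rows, edu, lab):
    """Join the two stores under the study-specific substitute ID."""
    return {
        row["study_id"]: {"education": edu[row["education"]], "labor": lab[row["labor"]]}
        for row in crosswalk_rows
    }


linked = link(crosswalk, education_store, labor_store)
```

    Because the research ID stays the same across updates, the 2022 and 2023 rows accumulate under one key without any personal information being re-matched, and the two agencies' rows meet only through the crosswalk.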