SPLAC with PERIODS #552

Draft
wants to merge 13 commits into
base: splac-beside-plac

Conversation

mother10

This is a draft pull request, an addition to "SPLAC besides PLAC".

It adds a PERIOD structure to SPLAC to give it time frames, so that the SPLAC data of one time frame is kept together.
It is an attempt to deal with the problems mentioned in the Zoom session of September 10, 2024.

It also tries to add GEDCOM-L's _LOC extension to see how that might look.

An extra Example page is added to show it "working". That page has worked examples of the problems that were mentioned in the Zoom session.

Hopefully this will prompt others to come up with more, and maybe better, ideas too!

Addition to SPLAC, following the zoom session of sept 10, 2024.
Attempt to solve the problems mentioned there.
New File.
Addition to SPLAC, following the zoom session of sept 10, 2024.
Attempt to solve the problems mentioned there.

File data-types
Addition to SPLAC, following the zoom session of sept 10, 2024.
Attempt to solve the problems mentioned there.

Examples belonging to the change.
Added reference to Period Structure. Explanation about adding old info is moved there.
There were some inconsistencies in the SPLAC links.
@mother10
Author

mother10 commented Oct 1, 2024

Here is some background information about this PR.

The Header: changing PLAC from {0:1} to {1:1}

That is because I worked with a well-known, widely used Dutch program that had no PLAC.FORM in the header and no PLAC for any place used in the GEDCOM. Trying to import that into another program gave lots of problems, as these programs could not automatically locate the places on a map.
So in my opinion it is important that a GEDCOM is required to have a way of defining how the places in that GEDCOM are specified,
whatever that way of defining might be.
In fact, I would very much like to see many of the HEAD tags be {1:1}, for the same reason:
to specify where this GEDCOM comes from, who it belonged to, and what it is about.
But I did not do that in this PR.

PLAC.FORM list

This list is just a start. It could be any list, as long as it does not have too many entries.
It is put there so every program can always see which "main" "jurisdictions" are used.
That's why it is {1:1} in this PR.
It makes sure the list is always available from this GEDCOM version onward.
The GOV number is added to further specify a certain entry from this list, as the GOV list defines a huge number of "jurisdictions".
That way software can map "jurisdictions" found in older GEDCOMs onto the system that will be used from this version of GEDCOM.

It could be that the tag name PLAC in the header should now be redefined since, as far as I understand, PLAC will be deprecated in the future?

SPLAC construction with TYPE having a list of parameters.

I struggled a bit with this. In my PR this construction now looks like:

  +1 TYPE <Text>                           {1:1}  g71:SPLAC-TYPE 
    +2 HREL <HIERARCHICAL_RELATIONSHIP>    {1:1}  g71:SPLAC-HREL
      +3 GOVTYP  <GOVID_OF_TYPE>           {0:1}
        +4 TEXT                            {0:1}  g71:SPLAC-TEXT  

This gives for example:

1 TYPE COUNTRY, POLI, 7, Federal State
It was done this way because two {1:1} fields now follow each other, so any "empty" commas would be at the end of the line.

I also tried:

1 TYPE COUNTRY, 7, POLI, Federal State

The GEDCOM there would have been:

  +1 TYPE <Text>                           {1:1}  g71:SPLAC-TYPE 
    +2 GOVTYP  <GOVID_OF_TYPE>             {0:1}
      +3 HREL <HIERARCHICAL_RELATIONSHIP>  {1:1}  g71:SPLAC-HREL
        +4 TEXT                            {0:1}  g71:SPLAC-TEXT  

That could give empty commas in the middle of the line.
And from my experience with one program I use, I know people sometimes have real difficulty reading empty commas.
We could overcome that in the second version above (not in the PR yet) by defining it like this:

  +1 TYPE <Text>                           {1:1}  g71:SPLAC-TYPE 
    +2 GOVTYP  <GOVID_OF_TYPE>             {1:1}
      +3 HREL <HIERARCHICAL_RELATIONSHIP>  {1:1}  g71:SPLAC-HREL
        +4 TEXT                            {0:1}  g71:SPLAC-TEXT  

So make GOVTYP obligatory with {1:1}.
But that means there should be a GOVTYP for "Unknown" or "Not Present", which could be denoted by the value 0.
So far I could not find something like that in the GOV list.
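
To make the "empty commas" trade-off concrete, here is a minimal Python sketch of how a reader might split the single-line form shown above (`1 TYPE COUNTRY, POLI, 7, Federal State`). It assumes the first field ordering (TYPE, HREL, GOVTYP, TEXT). The names `SplacType` and `parse_splac_type` are hypothetical and not part of the PR or of any GEDCOM library.

```python
from typing import NamedTuple, Optional


class SplacType(NamedTuple):
    """Fields of the comma-separated TYPE payload (assumed ordering)."""
    place_type: str        # e.g. "COUNTRY"
    hrel: str              # e.g. "POLI"
    govtyp: Optional[int]  # GOV type id, None when left empty
    text: Optional[str]    # free text, None when left empty


def parse_splac_type(payload: str) -> SplacType:
    # Split into at most four fields so the free text may contain commas.
    parts = [p.strip() for p in payload.split(",", 3)]
    parts += [""] * (4 - len(parts))  # pad omitted trailing fields
    place_type, hrel, govtyp, text = parts
    if not place_type or not hrel:
        raise ValueError("TYPE and HREL are {1:1} and must not be empty")
    return SplacType(place_type, hrel, int(govtyp) if govtyp else None, text or None)


print(parse_splac_type("COUNTRY, POLI, 7, Federal State"))
print(parse_splac_type("COUNTRY, POLI"))  # optional fields dropped at the end
```

With this ordering the optional fields sit at the end, so a line without GOVTYP and TEXT needs no placeholder commas. Only the case "TEXT without GOVTYP" still produces an empty field (`COUNTRY, POLI, , Some text`).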

PERIOD_STRUCTURE

This structure was named PERIOD_STRUCTURE, but that could always be changed to another name, like TFRAME_STRUCTURE (for TimeFrame).

The original description of how to convert old PLAC values into new SPLAC values was moved to the PERIOD_STRUCTURE.
But the examples there are not yet converted to the new SPLAC addition in this PR, as I have no idea whether this PR meets the needs. If it does, the examples must be changed too.

HREL

HREL could perhaps also be written as

  HREL POLI / RELI

that is, with 2 values separated by a slash. This is mentioned in an example, but as it is only an idea so far, it is not defined anywhere yet.
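
As a rough illustration of what the slash form would mean for a reader of the file, here is a small Python sketch. It assumes the separator is a single `/` with optional surrounding spaces, which is not defined anywhere yet, and the helper name `split_hrel` is hypothetical.

```python
from typing import List


def split_hrel(payload: str) -> List[str]:
    # "POLI / RELI" -> ["POLI", "RELI"]; a single value gives a one-item list.
    return [part.strip() for part in payload.split("/") if part.strip()]


print(split_hrel("POLI / RELI"))
print(split_hrel("POLI"))
```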

SUBM

Here SUBM was added in one place in this PR, but as I saw posts with a proposal to add SUBM in more places in GEDCOM, that could be done here too.

@@ -30,6 +30,18 @@ Negative integers are not supported by this specification.

The URI for the `Integer` data type is `xsd:nonNegativeInteger`.

## Decimal

Collaborator

Perhaps "Float" would be a better name for this proposed type.

Author

Hi Dave, first, thanks for the comments!

About Float: yes, I thought about that. But here this type is only meant to hold the length/distance, not the result of a very complicated calculation, so that's why I took Decimal. Should I change it to Float?

@@ -153,7 +155,7 @@ n HEAD {1:1} g7:HEAD
+1 SUBM @<XREF:SUBM>@ {0:1} g7:SUBM
+1 COPR <Text> {0:1} g7:COPR
+1 LANG <Language> {0:1} g7:HEAD-LANG
-+1 PLAC {0:1} g7:HEAD-PLAC
++1 PLAC {1:1} g7:HEAD-PLAC

Collaborator

Discussion in GEDCOM Steering Committee:

  • Requiring this would not allow supporting cases where one doesn't know the names of the jurisdiction labels, such as "Champaign" where the source record doesn't specify whether it's Champaign the city in Illinois or the county in Illinois.
  • The committee is considering the possibility of deprecating the use of HEAD.PLAC.FORM altogether, but wants more input on its use.

Author

One of the programs I work with outputs the "default" jurisdictions there when creating a new GEDCOM, depending on the language of the GEDCOM. For French it outputs "Lieudit, Commune, Code_INSEE, Code_Postal, Département, Région, Pays"; for Dutch it outputs "Gehucht, Plaats, Postcode, Provincie, Staat, Land". (So French or Dutch jurisdictions, which in my opinion is wrong; they should have been English in all cases, as other programs might not understand jurisdictions in all languages.)

That also happens when it imports a GEDCOM that has no jurisdictions in the HEAD.
So it IS used.

But because it is not mandatory, not all programs have it in the HEAD, causing lots of trouble when another program tries to interpret places entered in all kinds of forms, that is, in an unstructured way.
It often means that, after importing their file into another program, users have to go over their places by hand to get the correct map coordinates.
That's the reason I made it {1:1}.
Coming from a widely used Dutch program that has nothing at all for PLAC.FORM, neither in the HEAD nor in the file itself, I have seen many users struggle with that when they want to import somewhere else.

I always interpreted HEAD.PLAC.FORM as the base rule for jurisdictions. The default jurisdictions so to speak.
Defining a default set of jurisdictions, isn't that separate from how a certain place inside a GEDCOM is defined?
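
To illustrate what such a default gives an importing program, here is a minimal Python sketch that pairs the items of a PLAC value positionally with the labels from HEAD.PLAC.FORM. The helper name `label_place` is hypothetical, and the FORM string is only an example, not a proposed list.

```python
from typing import Dict


def label_place(form: str, place: str) -> Dict[str, str]:
    labels = [s.strip() for s in form.split(",")]
    parts = [s.strip() for s in place.split(",")]
    # Pair items by position and skip positions that were left empty.
    return {label: part for label, part in zip(labels, parts) if part}


form = "City, County, State, Country"
print(label_place(form, "Baltimore, , Maryland, USA"))
print(label_place(form, "Champaign"))
```

The second call shows the limitation raised in the committee discussion: a bare "Champaign" is labeled as the first jurisdiction of the default, whether or not the source meant the city.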

As you say about "Champaign", a user might not know exactly what it is. But that is only one place somewhere in the GEDCOM. Most other places might easily fit the default jurisdictions.
In the case of SPLAC, Champaign would only have the Champaign SPLAC and no further link to a parent SPLAC, so I think that would still be a problem then.

So if those "defaults" were deprecated, how would they be replaced?
Online systems, often used by genealogical programs to get the coordinates for a place, need a certain number of "jurisdictions" to find that place in their database. Just the place name is often not sufficient.

These were a couple of my thoughts when I changed this to {1:1}.
Would you rather I change it back?

- `PLAC` now is {1:1} so obligatory.
- `FORM` now has `g7:HEAD-PLAC-FORM71` which is defined as follows:
The `<List:Text> ` always consists of the following string of jurisdictions (smallest to largest:
**`LOCATION, ZIPCODE, VILLAGE, CITY, CODEINSEE, DISTRICT, PROVINCE, COUNTY, STATE, COUNTRY, SEA, EARTH`**

Collaborator

Discussion in GEDCOM Steering Committee:

  • Many locations don't have all (which the PR text mentions), and some have such jurisdictions in a different order, such as where a county is smaller than a city (such as Queens, New York), or a zip code covers multiple villages (e.g., townships in the U.S.), so the "smallest to largest" would contradict forcing those strings.

Author

OK, you are right. How about:

The <List:Text> contains the possible defaults for the jurisdictions in this GEDCOM file. Not all GEDCOM files will have each of those defaults; some will have fewer. The default jurisdictions are / can be:

LOCATION, ZIPCODE, VILLAGE, CITY, CODEINSEE, DISTRICT, PROVINCE, COUNTY, STATE, COUNTRY, SEA, EARTH
Their sequence (small to large) and number might differ per country; for instance Queens, New York, is a county that is smaller than a city. And the jurisdiction ZIPCODE is sometimes smaller than a city and sometimes larger.

- `FORM` now has `g7:HEAD-PLAC-FORM71` which is defined as follows:
The `<List:Text> ` always consists of the following string of jurisdictions (smallest to largest:
**`LOCATION, ZIPCODE, VILLAGE, CITY, CODEINSEE, DISTRICT, PROVINCE, COUNTY, STATE, COUNTRY, SEA, EARTH`**
These are in fact the original jurisdictions from GEDCOM7 with a few added. With this list, older GEDCOM files can be converted, by comparing their `PLAC.FORM's` with this list.

Collaborator

There are no "original jurisdictions in GEDCOM 7" per se. There are two examples, but those are merely examples, not normative.

Author

And this becomes:

These are in fact jurisdictions found in current GEDCOM files, with a few added. With this list, older GEDCOM files can be converted by comparing their PLAC.FORM values with this list.
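
A minimal Python sketch of what "comparing their PLAC.FORM values with this list" could look like. It assumes a plain case-insensitive match against the proposed defaults; the function name `match_form` is hypothetical and nothing like it is defined in the PR.

```python
from typing import List, Tuple

# The default jurisdictions proposed in this PR.
DEFAULT_JURISDICTIONS = [
    "LOCATION", "ZIPCODE", "VILLAGE", "CITY", "CODEINSEE", "DISTRICT",
    "PROVINCE", "COUNTY", "STATE", "COUNTRY", "SEA", "EARTH",
]


def match_form(old_form: str) -> Tuple[List[str], List[str]]:
    """Split an old PLAC.FORM into labels found / not found in the defaults."""
    known = set(DEFAULT_JURISDICTIONS)
    matched: List[str] = []
    unmatched: List[str] = []
    for label in (s.strip() for s in old_form.split(",")):
        if not label:
            continue
        (matched if label.upper() in known else unmatched).append(label)
    return matched, unmatched


print(match_form("City, County, State, Country"))
print(match_form("Gehucht, Plaats, Postcode, Provincie, Staat, Land"))
```

The Dutch labels in the second call all end up unmatched, so a real conversion would also need some translation or alias table, which is not sketched here.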

<<PLACE_DETAILS>> {1:1}
<<SHARED_PLACE_STRUCTURE>> {0:M} g71:SPLAC
+1 TYPE <Text> {1:1} g71:SPLAC-TYPE
+2 HREL <HIERARCHICAL_RELATIONSHIP> {1:1} g71:SPLAC-HREL

Collaborator

The _LOC extension has {0:M} occurrences of hierarchical relationship (i.e., one per pointer that can be 0:M), to allow for multiple types of nesting, including ecclesiastical, geographical, governmental, historical, etc.

Change HREL from {1:1} to {0:M}
<<PLACE_DETAILS>> {1:1}
<<SHARED_PLACE_STRUCTURE>> {0:M} g71:SPLAC
+1 TYPE <Text> {1:1} g71:SPLAC-TYPE
+2 HREL <HIERARCHICAL_RELATIONSHIP> {0:M} g71:SPLAC-HREL

Author

Changed this to {0:M}, but now Dave's original comment is gone?

So I probably did it the wrong way.
Can anyone explain how to do it properly, so the original comments stay?
