-
Notifications
You must be signed in to change notification settings - Fork 42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
allow_no_match doesn't mean allow_multiple_match #249
Comments
I'm not sure I understand the question? The commit in question doesn't
remove the ability to get errors- it differentiates between the 0 and many
case in the error message, which helps the developer figure out how to
resolve it.
…On Fri, Feb 3, 2017 at 10:26 PM, James McKinney ***@***.***> wrote:
@jamesturk <https://github.com/jamesturk>
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#249 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAAfYk5B86qcofxB0wyQhx8kW7bifNKjks5rY-_2gaJpZM4L3Bf6>
.
|
(The earlier behavior had a bug in that if allow_no_match was set to True,
it sometimes still raised errors if it couldn't match because of multiple
results)
…On Fri, Feb 3, 2017 at 10:33 PM, James Turk ***@***.***> wrote:
I'm not sure I understand the question? The commit in question doesn't
remove the ability to get errors- it differentiates between the 0 and many
case in the error message, which helps the developer figure out how to
resolve it.
On Fri, Feb 3, 2017 at 10:26 PM, James McKinney ***@***.***>
wrote:
> @jamesturk <https://github.com/jamesturk>
>
> —
> You are receiving this because you were mentioned.
> Reply to this email directly, view it on GitHub
> <#249 (comment)>,
> or mute the thread
> <https://github.com/notifications/unsubscribe-auth/AAAfYk5B86qcofxB0wyQhx8kW7bifNKjks5rY-_2gaJpZM4L3Bf6>
> .
>
|
FWIW, the previous behavior seemed correct to me (if multiple matches,
raise error even if allow_no_match is True)
…On Fri, Feb 3, 2017 at 9:34 PM, James Turk ***@***.***> wrote:
(The earlier behavior had a bug in that if allow_no_match was set to True,
it sometimes still raised errors if it couldn't match because of multiple
results)
On Fri, Feb 3, 2017 at 10:33 PM, James Turk ***@***.***> wrote:
> I'm not sure I understand the question? The commit in question doesn't
> remove the ability to get errors- it differentiates between the 0 and
many
> case in the error message, which helps the developer figure out how to
> resolve it.
>
>
>
> On Fri, Feb 3, 2017 at 10:26 PM, James McKinney <
***@***.***>
> wrote:
>
>> @jamesturk <https://github.com/jamesturk>
>>
>> —
>> You are receiving this because you were mentioned.
>> Reply to this email directly, view it on GitHub
>> <#249#
issuecomment-277415156>,
>> or mute the thread
>> <https://github.com/notifications/unsubscribe-auth/
AAAfYk5B86qcofxB0wyQhx8kW7bifNKjks5rY-_2gaJpZM4L3Bf6>
>> .
>>
>
>
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#249 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAgxbSyd6ntJCsFko2s7_mZKDClRyHNzks5rY_G0gaJpZM4L3Bf6>
.
--
773.888.2718
|
Hmm, I'm not sure I understand the use case? It is still unable to match, just for a different reason. |
I want semantics of `allow_no_match` to be "is okay if there is no object
that matches this signature". You changed the semantic to be "it is okay if
there is not a *single* object that matches this signature."
I often want the first, but I've never wanted the second. Why do you want
the second meaning?
…On Fri, Feb 3, 2017 at 9:40 PM, James Turk ***@***.***> wrote:
Hmm, I'm not sure I understand the use case? It is still unable to match,
just for a different reason.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#249 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAgxbeWrmuZmQb5RQtcug9kk5UyrESW5ks5rY_NEgaJpZM4L3Bf6>
.
--
773.888.2718
|
Same as @fgregg. I never want to allow multiple matches without a loud error.
That's not a bug. That's what 'allow no matches' means! |
So what should happen if an attempt is made to resolve a bill sponsor
'Smith' and there are two matches?
Crashing out the bill scraper b/c of an unresolved legislator is clearly not right. Which cases would you want to fail an unrelated scraper b/c of two ambiguous legislators?
…On Fri, Feb 3, 2017 at 10:53 PM, James McKinney ***@***.***> wrote:
Same as @fgregg <https://github.com/fgregg>. I never want to allow
multiple matches.
(The earlier behavior had a bug in that if allow_no_match was set to True,
it sometimes still raised errors if it couldn't match because of multiple
results)
That's not a bug. That's what 'allow no matches' means! allow_no_match
doesn't mean allow_zero_or_multiple_matches.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#249 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAAfYlGjDRd1WxM9vZ4iTiYel69LDQ0Nks5rY_YrgaJpZM4L3Bf6>
.
|
I get that reading, but if you look at how it is used: allow_no_match -> allow this method to return None, not allow the DB to have no results |
I would like pupa to throw an error and grind to a halt. |
That'll need to be a setting, that absolutely won't work for the volume of ambiguous legislators, we have hundreds. |
I guess it should be named There seem to be two use cases. One where you disambiguate objects in code by adding birth dates to legislators, for example, and one where you disambiguate using the features in admin views. I have always disambiguated objects in code, but I have far fewer related objects requiring resolution. |
If you don't throw an error, you will have thousands of pieces of legislation where you don't know who sponsored it. |
You can't add a birth date to a sponsor, voter, etc. as all you'll have is their last name. It is completely OK w/in the db schema for there to be sponsors, etc. with no leg_id set, that was an intended feature from the get-go (to not force perfect resolution if there's ambiguity), which is why this behavior was considered broken. |
Very true, but that ambiguity is often true upstream as well. Plus, we're already allowing None to be returned if it wasn't a recognized person. Consider these cases:
I'm pretty sure that all three should be treated the same, that is- record the textual name and fail to resolve the person_id. It doesn't seem like the first is any different from the other two. |
Ah, true, we only I think I was conflating the resolution here with other DB resolution ( |
Leaving open in case @fgregg has use case for old behavior. |
So, to me, there are two main differences. Interaction between types of bugs and under and overmatchingBugs in scraper code and bugs on websites can both lead to under and overmatching. In my experience, overmatching is likely a bug in the code and undermatching is likely a bug in the website. Correct behavior sometimes depends on allowing zero matchesI just wrote some code to allow for the resolution of posts that do not have district identifiers within memberships. These types of memberships have the same signature as memberships without posts. In order to allow for both "post-less" memberships and memberships associated with posts without districts, I'm taking advantage of the no matching behavior. In this case, I am not making concessions to a messy reality. I've probably overloaded the intent of allow_no_match to allow for accurate representation. I have no case, and can't think of a case, where the correct representation would depend on |
It sounds like there may need to be a second option added that certain
resolvers use, but we should take care to ensure that the people resolvers
called within bills, etc. continue to support the above. A warning is in
order, but a complete failure would give us no way to save bills when the
overmatch issue is upstream.
…On Fri, Feb 3, 2017 at 11:22 PM, Forest Gregg ***@***.***> wrote:
@jamesturk <https://github.com/jamesturk>
So, to me, there are two main differences.
Interaction between types of bugs and under and overmatching
Bugs in scraper code and bugs on websites can both lead to under and
overmatching. In my experience, overmatching is likely a bug in the *code*
and undermatching is likely a bug in the *website*.
Correct behavior sometimes depends on allowing zero matches
I just wrote some code to allow for the resolution of posts that do not
have district identifiers within memberships. These types of memberships
have the same signature as memberships without posts. In order to allow for
both "post-less" memberships and memberships associated with posts without
districts, I'm taking advantage of the no matching behavior.
In this case, I am not making concessions to a messy reality. I've
probably overloaded the intent of allow_no_match to allow for accurate
representation. I have no case, and can't think of a case, where the
correct representation would depend on allow_many_matches.
(btw, I don't think this is the best approach to district-less posts, but
it was a move I could make without a change to the Popolo standard)
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#249 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAAfYsE4b-2AazW6WcaZlhT1GLwej1fPks5rY_0FgaJpZM4L3Bf6>
.
|
Why was this change made? 0c2cb44
If you don't want to raise any errors at all, add a parameter like
dont_raise_errors
or something. I depend on multiple matches raising errors.allow_no_match
is internal to Pupa - I can't set it in my code. The cases whereallow_no_match=True
make sense to allow no matches. It doesn't make sense for all those cases to also mean 'allow multiple matches'.The text was updated successfully, but these errors were encountered: