What is the hash of a null-type attribute value?

classic Classic list List threaded Threaded
13 messages Options
Reply | Threaded
Open this post in threaded view
|

What is the hash of a null-type attribute value?

Rod Widdowson
Probably Tom and/or Scott:

I was going to put this into JIRA but a discussion here will be easier.  

I discovered while de-guava-ing the consent modules that net.shibboleth.idp.consent.logic.impl.AttributeValuesHashFunction was
annotated (and coded) to accept a @NullableElements Collection<IdPAttributeValue> parameter

The annotation is clearly incorrect since we haven't support null as an attribute value ever and explicitly since the early V3s.
Changing the code then threw a test failure against exactly this, which again is not unusual.

But then I got smart and decided to change the test to insert an EmptyAttributeValue.NULL element and it still failed.  Which is
also fine.

But I got to wondering what the correct behaviour of this hashing function should be in the presence of the empty attributes.   And
the answer is that I just don't know.

What the code currently does is accumulate all the values into a ByteArrayOutputStream wrapped inside an ObjectOutputStream, with
Scoped, String and XML attribute values all being serialized by special purpose code.  Everything else then falls into
value.getNativeValue().   When the accumulation is complete the result is base 64 converted.

I don't know what the hash is used for but AFAICS we end up with something which may be JVM (and possible instance of JVM) specific.
On my VM it appears to be some sort of line noise followed by the enum value
(net.shibboleth.idp.attribute.EmptyAttributeValue$NullAttribute) but I'm assuming that that isn't specified anywhere.  So it strikes
me that we need a special case EmptyAttributeValues since this is probably an "On Disk Format" and so unchanging, but I am not
really sure what it should be.

Then there is the philosophical question "What is the difference between a value that isn't there and one that is explicitly not
there".  And the pragmatic one that this function will never see such values.

For now I'll supress the test and move on.

        /Rod

--
To unsubscribe from this list send an email to [hidden email]
Reply | Threaded
Open this post in threaded view
|

Re: What is the hash of a null-type attribute value?

Cantor, Scott E.
The hashes are used to detect changes to the values when it's confgured to maintain previous decisions in cases where the data doesn't change. As long as the serialization is consistent it should be fine, so anything reasonable should work.

-- Scott


--
To unsubscribe from this list send an email to [hidden email]
Reply | Threaded
Open this post in threaded view
|

Re: What is the hash of a null-type attribute value?

Tom Zeller-3
> The hashes are used to detect changes to the values

Rod,

Looks like you changed the hashes of empty attribute values, are we
sure we want to do that ?

If a deployer of consent is comparing attribute values
(idp.consent.compareValues=true), then users with empty attribute
values will be prompted for consent (because the hash of an empty
attribute value has changed).

Thanks for catching this.

Tom
--
To unsubscribe from this list send an email to [hidden email]
Reply | Threaded
Open this post in threaded view
|

Re: What is the hash of a null-type attribute value?

Cantor, Scott E.
On 11/6/19, 3:28 PM, "dev on behalf of Tom Zeller" <[hidden email] on behalf of [hidden email]> wrote:

> Looks like you changed the hashes of empty attribute values, are we
> sure we want to do that ?

I think the supposition was that they're already not stable hashes, but would be JVM specific. So we probably need at least a one-time change to them to make them stable values.

-- Scott


--
To unsubscribe from this list send an email to [hidden email]
Reply | Threaded
Open this post in threaded view
|

Re: What is the hash of a null-type attribute value?

Tom Zeller-3
> I think the supposition was that they're already not stable hashes, but would be JVM specific.

FWIW Before the change, the hashes appeared to be stable on the
handful of Java 11 JVMs I have access to. Of course, YMMV.

Not arguing with the change to be more deterministic.

Tom
--
To unsubscribe from this list send an email to [hidden email]
Reply | Threaded
Open this post in threaded view
|

Re: What is the hash of a null-type attribute value?

Cantor, Scott E.
On 11/6/19, 5:06 PM, "dev on behalf of Tom Zeller" <[hidden email] on behalf of [hidden email]> wrote:

> FWIW Before the change, the hashes appeared to be stable on the
> handful of Java 11 JVMs I have access to. Of course, YMMV.

I didn't really look at the underlying issue. If it was stable but not guaranteed to be it's still probably worth taking the hit, particularly given the rarity of hitting empty values in practice.

-- Scott


--
To unsubscribe from this list send an email to [hidden email]
Reply | Threaded
Open this post in threaded view
|

RE: What is the hash of a null-type attribute value?

Rod Widdowson
Tom,  Thanks for spotting this.

Just to close the loop on this.  I'd concur with all Scott's points.

This is our only chance to fix this in until V5 and if we absolutely had to I'd happily break all consent in a V3 to V4.  Obviously
I'd prefer not to gratuitously break an on disk structure and we don't have such a need.

On reflection, even if it could be demonstrated that what is written to an object stream is formally specified I'd want to make this
change since the existing code is fragile in the face of (API) changes in the future.

I would be very surprised if null values actually every make it to attribute consent.  Indeed there is an argument that since (at
least) the SAML encoder drop null values that they should be serialized at all (one I don't buy FWIW).

        /Rod

--
To unsubscribe from this list send an email to [hidden email]
Reply | Threaded
Open this post in threaded view
|

Re: What is the hash of a null-type attribute value?

Tom Zeller-3
> On reflection, even if it could be demonstrated that what is written to an object stream is formally specified I'd want to make this
> change since the existing code is fragile in the face of (API) changes in the future.

If we want to write a String instead of an Enum to avoid Java API
changes, how about
 objectOutputStream.writeObject(EmptyType.NULL_VALUE.toString());
instead of
 objectOutputStream.writeObject("NULLVALUE");
?

Or, just
 objectOutputStream.writeObject(value.getNativeValue().toString())
should work too, I would think.

Tom
--
To unsubscribe from this list send an email to [hidden email]
Reply | Threaded
Open this post in threaded view
|

Re: What is the hash of a null-type attribute value?

Rod Widdowson
But the point is that enum.toString() is not defined (AFAIK) and even if it is, it contains the full class name - just like the raw thing currently does. That leads to fragility at an API change...

By absolutely interpreting the value type in the same way we do for the other types we keep control of everything....

Sent from my iPad

> On 7 Nov 2019, at 18:37, Tom Zeller <[hidden email]> wrote:
>
> 
>>
>> On reflection, even if it could be demonstrated that what is written to an object stream is formally specified I'd want to make this
>> change since the existing code is fragile in the face of (API) changes in the future.
>
> If we want to write a String instead of an Enum to avoid Java API
> changes, how about
> objectOutputStream.writeObject(EmptyType.NULL_VALUE.toString());
> instead of
> objectOutputStream.writeObject("NULLVALUE");
> ?
>
> Or, just
> objectOutputStream.writeObject(value.getNativeValue().toString())
> should work too, I would think.
>
> Tom
> --
> To unsubscribe from this list send an email to [hidden email]

--
To unsubscribe from this list send an email to [hidden email]
Reply | Threaded
Open this post in threaded view
|

Re: What is the hash of a null-type attribute value?

Cantor, Scott E.
On 11/7/19, 2:15 PM, "dev on behalf of Rod Widdowson" <[hidden email] on behalf of [hidden email]> wrote:

> But the point is that enum.toString() is not defined (AFAIK) and even if it is, it contains the full class name - just like the
> raw thing currently does. That leads to fragility at an API change...

Obviously it doesn't have to contain the raw class name since we can implement toString ourselves, but I think it's generally understood that toString shouldn't be used for much besides object dumping/debugging, and explicit methods ought to be used for anything meaningful.

-- Scott


--
To unsubscribe from this list send an email to [hidden email]
Reply | Threaded
Open this post in threaded view
|

Re: What is the hash of a null-type attribute value?

Tom Zeller-3
In reply to this post by Rod Widdowson
> But the point is that enum.toString() is not defined (AFAIK) and even if it is, it contains the full class name - just like the raw thing currently does. That leads to fragility at an API change...

No, I don't think so. Enum.toString() and Enum.name() return the same
name string (in our case NULL_VALUE or ZERO_LENGTH_VALUE) unless
toString() is overridden. It's ObjectOutputStream.writeObject() which
ends up calling the private writeEnum() which prefixes the name with
the class name.

Tom
--
To unsubscribe from this list send an email to [hidden email]
Reply | Threaded
Open this post in threaded view
|

Re: What is the hash of a null-type attribute value?

Tom Zeller-3
In reply to this post by Cantor, Scott E.
>  but I think it's generally understood that toString shouldn't be used for much besides object dumping/debugging, and explicit methods ought to be used for anything meaningful.

Agree. But ... except for Enum, which says toString() is preferred over name().

 https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/lang/Enum.html

Tom
--
To unsubscribe from this list send an email to [hidden email]
Reply | Threaded
Open this post in threaded view
|

RE: What is the hash of a null-type attribute value?

Rod Widdowson
In reply to this post by Tom Zeller-3
> No, I don't think so. Enum.toString() and Enum.name() return the same
> name string (in our case NULL_VALUE or ZERO_LENGTH_VALUE) unless
> toString() is overridden. It's ObjectOutputStream.writeObject() which
> ends up calling the private writeEnum() which prefixes the name with
> the class name.

That works for me.  I'll make an appropriate change today.  Of course the real answer (which I should have bitten off when we
refactored the IdPAttributeValue) is to move that function into the attributeValue, that way we would never have to have this
conversation again (you'll understand that my real worry is the separation between the two implementations and that a change over in
attribute-api could unwittingly break this code).

But it's too late to do that now and I wouldn't imagine that there will be much movement in adding new attribute value types (I
checked the OIDC plugin doesn't have one).

Thanks Tom

        /Rod

--
To unsubscribe from this list send an email to [hidden email]