"No EAP session matching state" - logging

classic Classic list List threaded Threaded
2 messages Options
| Threaded
Open this post in threaded view
|

"No EAP session matching state" - logging

Bob Franklin
Hello,

Please excuse I haven't got a "radiusd -X" debugging output for this -- trying to catch one is quite difficult, given the intermittent nature of the problem, the difficulty matching a problematic supplicant and the volume of traffic I'm sifting through, as well as the issue being on a NAS that's behind a downstream proxy (but I'm responsible for the home server[*]) and with a supplicant I don't have access to, so I thought I'd start with general principles first, just to check my understanding is correct...

I am investigating a situation where we're getting an Access-Reject reply being sent, with a Module-Failure-Message of "No EAP session matching state ...".  The previous connection successful attempt was several minutes before and the supplicant repeatedly tries for a minute, with errors showing this state number, then I get messages with another state number, etc.  Eventually it successfully connects, so I get streams such as:

2020-10-13 15:44:41 [various challenges] + Access-Accept

2020-10-13 15:47:41 Access-Reject: No EAP session matching state 0xc600f921c5e0e014
2020-10-13 15:47:43 Access-Reject: No EAP session matching state 0xc600f921c5e0e014
2020-10-13 15:47:45 Access-Reject: No EAP session matching state 0xc600f921c5e0e014
2020-10-13 15:47:47 Access-Reject: No EAP session matching state 0xc600f921c5e0e014
2020-10-13 15:47:49 Access-Reject: No EAP session matching state 0xc600f921c5e0e014
2020-10-13 15:47:51 Access-Reject: No EAP session matching state 0xc600f921c5e0e014
2020-10-13 15:47:53 Access-Reject: No EAP session matching state 0xc600f921c5e0e014
2020-10-13 15:47:55 Access-Reject: No EAP session matching state 0xc600f921c5e0e014
2020-10-13 15:47:57 Access-Reject: No EAP session matching state 0xc600f921c5e0e014
2020-10-13 15:47:59 Access-Reject: No EAP session matching state 0xc600f921c5e0e014
2020-10-13 15:48:01 Access-Reject: No EAP session matching state 0xc600f921c5e0e014

2020-10-13 15:48:35 Access-Reject: No EAP session matching state 0x3c4d84e73c959d9c
2020-10-13 15:48:37 Access-Reject: No EAP session matching state 0x3c4d84e73c959d9c
2020-10-13 15:48:39 Access-Reject: No EAP session matching state 0x3c4d84e73c959d9c
2020-10-13 15:48:41 Access-Reject: No EAP session matching state 0x3c4d84e73c959d9c
2020-10-13 15:48:43 Access-Reject: No EAP session matching state 0x3c4d84e73c959d9c
2020-10-13 15:48:45 Access-Reject: No EAP session matching state 0x3c4d84e73c959d9c
2020-10-13 15:48:47 Access-Reject: No EAP session matching state 0x3c4d84e73c959d9c
2020-10-13 15:48:49 Access-Reject: No EAP session matching state 0x3c4d84e73c959d9c
2020-10-13 15:48:51 Access-Reject: No EAP session matching state 0x3c4d84e73c959d9c
2020-10-13 15:48:53 Access-Reject: No EAP session matching state 0x3c4d84e73c959d9c
2020-10-13 15:48:55 Access-Reject: No EAP session matching state 0x3c4d84e73c959d9c

2020-10-13 15:54:57 Access-Reject: No EAP session matching state 0x5ea230995b702974
2020-10-13 15:54:59 Access-Reject: No EAP session matching state 0x5ea230995b702974
2020-10-13 15:55:01 Access-Reject: No EAP session matching state 0x5ea230995b702974
2020-10-13 15:55:03 Access-Reject: No EAP session matching state 0x5ea230995b702974
2020-10-13 15:55:05 Access-Reject: No EAP session matching state 0x5ea230995b702974

2020-10-13 15:56:13 Access-Reject: No EAP session matching state 0x5d35a9ea5f31b098
2020-10-13 15:56:15 Access-Reject: No EAP session matching state 0x5d35a9ea5f31b098
2020-10-13 15:56:17 Access-Reject: No EAP session matching state 0x5d35a9ea5f31b098
2020-10-13 15:56:19 Access-Reject: No EAP session matching state 0x5d35a9ea5f31b098
2020-10-13 15:56:21 Access-Reject: No EAP session matching state 0x5d35a9ea5f31b098
2020-10-13 15:56:23 Access-Reject: No EAP session matching state 0x5d35a9ea5f31b098
2020-10-13 15:56:25 Access-Reject: No EAP session matching state 0x5d35a9ea5f31b098
2020-10-13 15:56:27 Access-Reject: No EAP session matching state 0x5d35a9ea5f31b098
2020-10-13 15:56:29 Access-Reject: No EAP session matching state 0x5d35a9ea5f31b098
2020-10-13 15:56:31 Access-Reject: No EAP session matching state 0x5d35a9ea5f31b098
2020-10-13 15:56:33 Access-Reject: No EAP session matching state 0x5d35a9ea5f31b098
2020-10-13 15:56:35 Access-Reject: No EAP session matching state 0x5d35a9ea5f31b098

2020-10-13 15:56:40 [various challenges] + Accept-Accept

Is there anything I can infer from this, or would I need a full debug log for an affected supplicant?  I assume these sessions either never existed (and the supplicant is just plain broken) or they were previous sessions that have been cleaned up (which I believe happens after 5 seconds - these go on for longer).

Does getting these these new state numbers off the bat (i.e. with no previous EAP messages from the supplicant) imply the supplicant is doing something odd and continuing EAP conversations without me having seen the start of them?  I linelog a message every time I receive a packet (so a full authentication stream has several lines) and have nothing for this supplicant aside from the above messages.

Is there any way I can log the 'EAP session state' field in my linelog from previous messages so I can see if they're related to ones I've seen before?

One thing that does look odd is that, from this particular downstream RADIUS proxy, we are seeing a lot of reauthentications (albeit mostly successful): I would normally only expect to see supplicants reauthenticating occasionally, but I'm often seeing things reauthenticate every 2-3 minutes, which looks a bit suspiciously like there are local problems at the visited site.


Apologies if there's nothing you can really do to help me without the full debug log and, if I need it, I'll try and get in touch with an affected user and see if we can get it to fail - let me know and I'll come back when I have it.  Because problems often only get reported after the event, I try to get as much as possible logged that I can look back through easily (a full debug log probably being too much), if someone reports something.

  - Bob


* this is eduroam


--
Robert Franklin <[hidden email]> / (+44 1223 7) 48479
Network Systems, University Information Services, University of Cambridge


-
List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html
| Threaded
Open this post in threaded view
|

Re: "No EAP session matching state" - logging

Alan DeKok-2
On Oct 13, 2020, at 12:16 PM, Robert Franklin <[hidden email]> wrote:
> Please excuse I haven't got a "radiusd -X" debugging output for this -- trying to catch one is quite difficult, given the intermittent nature of the problem, the difficulty matching a problematic supplicant and the volume of traffic I'm sifting through, as well as the issue being on a NAS that's behind a downstream proxy (but I'm responsible for the home server[*]) and with a supplicant I don't have access to, so I thought I'd start with general principles first, just to check my understanding is correct...

  Sometimes it's fine to not have debugging issues.

> I am investigating a situation where we're getting an Access-Reject reply being sent, with a Module-Failure-Message of "No EAP session matching state ...".  The previous connection successful attempt was several minutes before and the supplicant repeatedly tries for a minute, with errors showing this state number, then I get messages with another state number, etc.  Eventually it successfully connects, so I get streams such as:
>
> 2020-10-13 15:44:41 [various challenges] + Access-Accept
>
> 2020-10-13 15:47:41 Access-Reject: No EAP session matching state 0xc600f921c5e0e014
> 2020-10-13 15:47:43 Access-Reject: No EAP session matching state 0xc600f921c5e0e014

  Something is doing retransmits *after* the previous packets have been replied to, or have expired.  This is bad.

  It's generally the supplicant, or maybe the NAS.  And there's little you can do to "fix" FreeRADIUS when something *else* is broken.

  Note also that the state is 16 bytes, and not all of it is being printed.  So the "No EAP session" messages may be for multiple packets.

  From my magic 8-ball, I can tell that the State attributes are for the first or second packets in an EAP session.  So the session isn't really getting established.

> Is there anything I can infer from this, or would I need a full debug log for an affected supplicant?  I assume these sessions either never existed (and the supplicant is just plain broken) or they were previous sessions that have been cleaned up (which I believe happens after 5 seconds - these go on for longer).

  Pretty much.  The server is replying, and the reply never makes it to the supplicant.  So the supplicant retransmits, going "hey, what's up?".  The NAS sees that, re-uses the old State... and the server goes "no, I replied already".

> Does getting these these new state numbers off the bat (i.e. with no previous EAP messages from the supplicant) imply the supplicant is doing something odd and continuing EAP conversations without me having seen the start of them?  I linelog a message every time I receive a packet (so a full authentication stream has several lines) and have nothing for this supplicant aside from the above messages.=

  Each packet containing EAP has a State attribute.  The only ones missing it are the first one, and the last one.

> Is there any way I can log the 'EAP session state' field in my linelog from previous messages so I can see if they're related to ones I've seen before?

  The State attribute is largely random garbage.  There isn't much point in logging that.

> One thing that does look odd is that, from this particular downstream RADIUS proxy, we are seeing a lot of reauthentications (albeit mostly successful): I would normally only expect to see supplicants reauthenticating occasionally, but I'm often seeing things reauthenticate every 2-3 minutes, which looks a bit suspiciously like there are local problems at the visited site.

  Pretty much.  If they have radio layer issues, or AP issues, supplicants will retry a LOT.

  Call them and tell them that their equipment is crap. :(

> Apologies if there's nothing you can really do to help me without the full debug log and, if I need it, I'll try and get in touch with an affected user and see if we can get it to fail - let me know and I'll come back when I have it.  Because problems often only get reported after the event, I try to get as much as possible logged that I can look back through easily (a full debug log probably being too much), if someone reports something.

  Debug logs won't really help here.  Something in the network is broken, and it isn't FreeRADIUS.

  Alan DeKok.


-
List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html