Firewall Wizards mailing list archives

Re: "Who else picked this one up?"


From: "Marcus J. Ranum" <mjr () nfr net>
Date: Fri, 30 Apr 1999 20:52:17 -0400

Paul Robertson writes:
A hashed IP address isn't going to be really useful as a cover if it's 
easily recreated, and not so useful as a tool if it isn't.  I'd rather 
see real addresses, with heavy disclaimers that packets may be spoofed.

True. This is a Hard Problem(tm) - I was toying with 3 choices:
        1) Send up hashed addresses
        2) Send up keyed hashed addresses
        3) Send up actual addresses

Hashed addresses have the advantage that we're not publishing a
"black list" of addresses. It has the disadvantage that someone
can pretty easily brute force the hashes.
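
A minimal sketch of why an unkeyed hash is easy to reverse, assuming
SHA-256 over the dotted-quad string (the message does not name a hash
function or an encoding). The function names and the documentation
address 192.0.2.77 are hypothetical. The search here covers one /24;
the whole IPv4 space is only 2^32 candidates, so the same loop scales
to every possible address.

```python
import hashlib
import ipaddress

def hash_addr(addr: str) -> str:
    """Unkeyed hash of an address (SHA-256 is an assumption)."""
    return hashlib.sha256(addr.encode("ascii")).hexdigest()

def brute_force(target_digest: str, network: str):
    """Try every address in `network` until one hashes to the target."""
    for ip in ipaddress.ip_network(network):
        candidate = str(ip)
        if hash_addr(candidate) == target_digest:
            return candidate
    return None

# A "published" hashed address, as it might appear in the database.
published = hash_addr("192.0.2.77")

# Anyone who can guess the neighborhood recovers the address.
recovered = brute_force(published, "192.0.2.0/24")
```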

Using keyed hashed addresses has the advantage that only the
person who submits the address can verify that it matches
previous/other entries. So groups of network managers who are
cooperating could share the keys and generate useful information
without sharing it. It has the disadvantage that correlation
across addresses would then be impractical/useless.
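
The keyed variant could be sketched with HMAC, assuming HMAC-SHA-256
and a key shared out of band by a group of cooperating sites (both are
assumptions; the key values and names below are hypothetical). Two
submitters holding the same key produce matching values for the same
address, while a holder of a different key cannot correlate them.

```python
import hashlib
import hmac

def keyed_hash(key: bytes, addr: str) -> str:
    """Keyed hash of an address (HMAC-SHA-256 is an assumption)."""
    return hmac.new(key, addr.encode("ascii"), hashlib.sha256).hexdigest()

group_key = b"shared-by-cooperating-sites"   # hypothetical shared key
other_key = b"some-other-group"              # hypothetical unrelated key

a = keyed_hash(group_key, "192.0.2.77")   # site A's submission
b = keyed_hash(group_key, "192.0.2.77")   # site B, same key, same address
c = keyed_hash(other_key, "192.0.2.77")   # a different group's submission

same_group_match = hmac.compare_digest(a, b)    # members can correlate
cross_group_match = hmac.compare_digest(a, c)   # outsiders cannot
```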

Using the actual addresses has the advantage of simplicity
and functionality. It has the disadvantage of becoming a
potential "black list" and/or legal/privacy problem. I agree
with Paul that putting heavy disclaimers around the
database would be a sensible precaution, but I don't want to
trust people's ability to read disclaimers.

The important issue IMO is in the reporter's validity.  That's a tougher 
nut to crack, but should probably be a longer-term goal.  Victim data is 
going to be more difficult to get from everyone than attacker data.  

I figured that the way to crack the reporter validity problem is
to rate reporters to different degrees of trustworthiness. We'd
support an authenticated/encrypted record submission approach
as well as an anonymous one. Authenticated records would be
rated as "more reliable" and some of them (based on knowledge
of the individuals or organizations involved) might get additional
ratings. I'm still at the "scratching my head" phase of this
part of the problem. :) But I think it might work. If the script
kiddies decided to "pack" the database it'd be easy enough to
simply "reduce" the database to tiers of "trustworthiness" and
treat that data as "FYI" instead of fact. But that raises the
problem of disclaimers and getting people to read them.
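
One way the tiering might look, as a sketch only: the three tier
names, the Report fields, and reduce_to_tier are all hypothetical, and
the message itself says this part of the design is unsettled.

```python
from dataclasses import dataclass
from enum import IntEnum

class Trust(IntEnum):
    ANONYMOUS = 0       # unauthenticated submission
    AUTHENTICATED = 1   # authenticated/encrypted submission
    VOUCHED = 2         # authenticated, plus a known individual or organization

@dataclass(frozen=True)
class Report:
    reporter_id: str
    source_addr: str
    trust: Trust

def reduce_to_tier(reports, minimum: Trust):
    """Keep only reports at or above the given trust tier."""
    return [r for r in reports if r.trust >= minimum]

reports = [
    Report("anon-1", "192.0.2.10", Trust.ANONYMOUS),
    Report("anon-2", "192.0.2.10", Trust.ANONYMOUS),
    Report("site-a", "192.0.2.77", Trust.AUTHENTICATED),
    Report("site-b", "192.0.2.77", Trust.VOUCHED),
]

# If anonymous submissions look "packed", fall back to the higher tiers.
reliable = reduce_to_tier(reports, Trust.AUTHENTICATED)
```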

How do you envision using the data, and how much of it (if any) should be 
blind analysis?

Well, that's the _really_ interesting question!!!
What could this data be used for?

First off, it'd be the first attempt I know of to quantify
the level and rate at which corporate and personal sites/systems
are scanned by "vulnerability assessment" tools (hacker tools)
or "illustrate problems with windows security" tools (hacker tools).
The information there could make for some interesting studies.
Are web sites (like where you work, Paul...?) scanned more often
than personal systems? How often are cable modem pools scanned,
versus dialups? Are the scans being conducted from within the USA
or outside? What countries are most popular? Are there time-of-day
patterns or is the activity constant? Does scanning activity
fluctuate with school schedules/college schedules? What percentage
of scans appear from ISPs? What percentage of scans appear from
corporate sites?

It might be very interesting to see what ISPs are the main sources
of "incidents" and forward the information to them. Perhaps
automatically. :)

It might also make a worthwhile data set for folks to use for
calibrating anomaly detection systems. I suspect there are
researchers who could have fun with such a database.

It might make some useful data for getting corporate management
and ISP management and maybe even Feds to realize that, yes,
Dorothy, there is a problem. It might make useful data for
convincing people that dial-up is not secure. It might make
useful data for convincing cable service providers to think
about designing their crap better.

It might make beautiful artwork, if you rendered a 5-hour
segment of the probes originating from a specific ISP on top
of Cheswick's Internet Maps. :)

It might serve as "probable cause" for cops to bust script
kiddies who spend all day scanning networks for Back Orifice.
BackOfficer Friendly has already scored a few hacker-kills
this way, according to a few cops I gave early versions to.

Anyone got thoughts they'd like to share about some of the
information that might be worth gathering? We thought we'd

Originating AS of the apparent source of the packets.  It's time to start 
dragging providers into the mess in some tangential way.  If there are 
highly abusive networks, then that issue needs to be raised with those 
network operators.

Yep. It'd be _tempting_ to black hole them but I don't think
it's time (yet) for Internet vigilantism. _Yet_. We can't, _yet_
because we don't have good enough data to justify vigilantism.

Time both local and zulu (GMT) would also be good for overall trending.  

Good point. I figured we'd have the client submit its idea of the
time (and timezone) when it uploads records. Then we could use
patented subtraction technology to adjust the times.
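
The subtraction could be sketched like this, assuming the client sends
its current clock reading (with UTC offset) alongside the records and
the server compares that reading against its own clock. The function
name and the sample times are hypothetical.

```python
from datetime import datetime, timedelta, timezone

def normalize(event_local: datetime,
              client_now: datetime,
              server_now_utc: datetime) -> datetime:
    """Correct an event time for client clock skew and return it in UTC."""
    # Subtracting timezone-aware datetimes compares absolute instants,
    # so the client's timezone offset drops out and only the skew remains.
    skew = server_now_utc - client_now
    return (event_local + skew).astimezone(timezone.utc)

edt = timezone(timedelta(hours=-4))

# The client's clock runs three minutes fast at upload time.
client_now = datetime(1999, 4, 30, 20, 52, 17, tzinfo=edt)
server_now = datetime(1999, 5, 1, 0, 49, 17, tzinfo=timezone.utc)

# A probe the client logged at 20:40:00 local time.
event = datetime(1999, 4, 30, 20, 40, 0, tzinfo=edt)
event_utc = normalize(event, client_now, server_now)
```

Keeping the client's stated offset alongside the corrected UTC value
would preserve both local and zulu time for trending.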

I also was thinking that the source of the data records would
be optional. A site uploading records could upload them with
its "tagged" addresses, or keyed hashes of the tagged addresses.
That wouldn't show up in the database for query, it'd just be
recorded by unique site ID.

Still "scratching my head" over this stuff. Ches wanted to do
this kind of thing back in (hmm, 1992, I think it was) but
we never got around to it. He wanted to share syslogs. I never
figured out what I'd do with the information. Now that I'm
working on intrusion detection I'm starting to get ideas of
what to do with the information. :)

mjr.
--
Marcus J. Ranum, CEO, Network Flight Recorder, Inc.
work - http://www.nfr.net
home - http://www.clark.net/pub/mjr


