Lesson 7: How to Aggregate Network Flows Into Application Flows

Prof Wool: Network Segmentation | Lesson 7: How to Aggregate Network Flows Into Application Flows

Learn more about AlgoSec at http://www.algosec.com and read Professor Wool's blog posts at http://blog.algosec.com


Hello, I'm Professor Wool. Today we'll be discussing how to aggregate network flows into application flows. If you remember from previous lessons, we talked about learning or discovering application flows from network traffic, either by deploying a sniffer and capturing full pcap files or by using a NetFlow source and processing that. Today I want to look a little deeper into this processing part, to see what sort of considerations go into doing that well.

Our starting point is, let's say, a NetFlow capture: a list of records with source, destination, and service, which we want to represent as flows supporting some application. If we look at this, an easy observation to make is that there is a lot of repetition in the service. Also, in this example, all the destinations are the same: one IP address. And there are multiple sources.

So the easiest thing to do is to take this representation and convert it to something that looks like what we have here in option number 1, at the top, where the service is listed once, the destination is listed once as just an IP address, and in the source we see the list of all the IP addresses that were discovered in the NetFlow, sorted by IP address. This can be done quite easily. However, I argue that this is not very satisfactory, for a few reasons. First of all, it's too detailed. Here we have only six, but there could be hundreds or thousands of separate IP addresses appearing in the source, and this is too long and too detailed for a person to look at and understand. It's also too accurate. It records exactly the IP addresses that were seen in the network capture, but it's not future-proof. If you look at this list and see that IP addresses .7, .8, and .13 connected to this destination, it's quite plausible that IP addresses .9, .10, .11, and .12 would also connect to the same web server at some point in the future. They just didn't do so while we were capturing the traffic. If we restrict ourselves to only the IP addresses that we observed, we get something that is very accurate but not future-proof. So this might not be the best way to represent the flows that we observed.

An alternative is what we have here in option number 2. Instead of having the individual IP addresses listed in the source, we just have a source of "any", the destination is that web server, and the service is HTTP. In terms of usability this is great. It is very compact: just a single record describing all the flows that were actually observed. And it's completely future-proof: any possible IP address could appear in the source and the flow still describes it. So this is good. The downside is, of course, that it is too broad. This is very, very inaccurate. It's not that we saw every single IP address in the Internet, and the IP addresses we did see are not uniformly distributed. They are quite focused. You can see that the 10.2 subnet is quite visible, and then we have this outlier in the 3.7 IP address, but it's still pretty focused. So we could do better than this as well.

Trying to strike the balance is what I have here in option number 3. The destination and service are as before, but in terms of the source, the 3.7.1.7 IP address appears separately as a /32 CIDR block, and the five other IP addresses appear as 10.2.1.0/24, which is wider than what was strictly observed. This describes 256 possible IP addresses, but they're all grouped together in one subnet, so it is future-proof up to a point: all the IP addresses in the 10.2 subnet are described by this flow. It's not as detailed as what we had up here in option 1, it's reasonably accurate, and it's quite usable because it's still very compact.

So it is possible to find this middle ground algorithmically, balancing the accuracy against the usability, and to produce a compact representation that's still reasonable and usable. Thank you for your attention.
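
The middle ground described in option 3 can be illustrated with a short Python sketch. This is not the algorithm used in the lesson or in any product, only one simple heuristic under stated assumptions: sources are grouped into fixed-size subnets (a /24 by default), and a subnet replaces its members only when at least `min_hosts` observed sources fall inside it. The destination address `192.0.2.10`, the source addresses `10.2.1.21` and `10.2.1.40`, and the names `summarize_sources`, `prefix_len`, and `min_hosts` are hypothetical and chosen for illustration.

```python
from collections import defaultdict
from ipaddress import ip_address, ip_network

# Hypothetical NetFlow-style records: (source, destination, service).
# The destination and two of the 10.2.1.x sources are made up for the example.
RECORDS = [
    ("10.2.1.7", "192.0.2.10", "http"),
    ("10.2.1.8", "192.0.2.10", "http"),
    ("10.2.1.13", "192.0.2.10", "http"),
    ("10.2.1.21", "192.0.2.10", "http"),
    ("10.2.1.40", "192.0.2.10", "http"),
    ("3.7.1.7", "192.0.2.10", "http"),
]


def group_sources(records):
    """Option 1: one flow per (destination, service), listing every observed source."""
    flows = defaultdict(set)
    for src, dst, svc in records:
        flows[(dst, svc)].add(ip_address(src))
    return {key: sorted(srcs) for key, srcs in flows.items()}


def summarize_sources(sources, prefix_len=24, min_hosts=2):
    """Option 3 (assumed heuristic): widen to a subnet when enough sources share it."""
    by_subnet = defaultdict(list)
    for addr in sources:
        # strict=False masks the host bits, e.g. 10.2.1.7/24 -> 10.2.1.0/24
        subnet = ip_network(f"{addr}/{prefix_len}", strict=False)
        by_subnet[subnet].append(addr)

    blocks = []
    for subnet, members in by_subnet.items():
        if len(members) >= min_hosts:
            blocks.append(subnet)  # dense subnet: keep the wider block
        else:
            blocks.extend(ip_network(f"{a}/32") for a in members)  # outliers stay /32
    return sorted(blocks)


def aggregate(records, prefix_len=24, min_hosts=2):
    """Build one flow per (destination, service) with summarized sources."""
    return {
        key: summarize_sources(srcs, prefix_len, min_hosts)
        for key, srcs in group_sources(records).items()
    }


for (dst, svc), blocks in aggregate(RECORDS).items():
    print(", ".join(str(b) for b in blocks), "->", dst, svc)
# prints: 3.7.1.7/32, 10.2.1.0/24 -> 192.0.2.10 http
```

In this sketch the two parameters move the result between the three options: a `min_hosts` larger than the number of observed sources leaves every source as a /32 (option 1), while `prefix_len=0` with `min_hosts=1` collapses everything into `0.0.0.0/0`, the "any" source of option 2. A real tool would likely choose prefix lengths adaptively rather than use one fixed value.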
