nanog mailing list archives

traceroute with load?


From: Dave Taht <dave.taht () gmail com>
Date: Thu, 25 Nov 2021 08:57:06 -0800

Historically the bufferbloat effort has used irtt, ping, and mtr in
combination with a set of TCP flows to induce and graph the problem via the
flent tool. I hadn't thought all that much about ECMP or isolating the
bloated hop until recently, as an outgrowth of Apple's networkQuality effort
here:

https://raw.githubusercontent.com/network-quality/draft-cpaasch-ippm-responsiveness/master/draft-cpaasch-ippm-responsiveness.txt

TCP_INFO, at least in Linux, has now accumulated an amazing number of
useful-looking statistics that few are using as yet. Monitoring hop count,
and perhaps changes to the flow ID in transit, might also be useful. Key to
my thinking at the moment is that I think it's possible, after observing RTT
inflation, to drop the TTL during a fat flow to find where the bloated hop
is, and although I started drafting the ideas for the new tool here

https://github.com/dtaht/wtbb

I am sufficiently lazy to wonder if it's been done before? And what other
statistics would be useful to try and obtain?
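
As a rough illustration of the TTL-drop idea above (this is not code from
wtbb, and the message does not describe that tool's design): suppose a prober
has already collected per-TTL RTT samples twice, once on an idle path and
once while a fat flow is running. The lowest TTL at which the RTT inflates
under load then brackets the bloated queue, which should sit at or just
before that hop. The function name, the input layout, the sample numbers,
and the 20 ms threshold are all assumptions made for this sketch, and ICMP
deprioritization on routers can distort the per-hop figures.

```python
from statistics import median


def first_inflated_hop(idle_ms, loaded_ms, threshold_ms=20.0):
    """Return the lowest TTL whose median RTT rises by more than
    threshold_ms under load, or None if no hop does.

    idle_ms and loaded_ms map TTL -> list of RTT samples in milliseconds.
    Both the layout and the threshold are assumptions of this sketch.
    """
    for ttl in sorted(set(idle_ms) & set(loaded_ms)):
        idle, loaded = idle_ms[ttl], loaded_ms[ttl]
        if not idle or not loaded:
            continue  # hop did not answer in one of the two runs
        if median(loaded) - median(idle) > threshold_ms:
            return ttl
    return None


# Made-up sample data: hops 1-2 are unaffected, hop 3 onward inflates.
idle = {
    1: [1.0, 1.2, 0.9],
    2: [8.1, 8.4, 7.9],
    3: [12.0, 12.5, 11.8],
    4: [20.3, 19.9, 20.1],
}
loaded = {
    1: [1.1, 1.0, 1.3],
    2: [8.3, 9.0, 8.2],
    3: [95.0, 120.4, 88.7],
    4: [101.2, 130.0, 97.5],
}

print(first_inflated_hop(idle, loaded))  # prints 3
```

Medians are used rather than means so that a single slow ICMP reply from a
busy router does not trip the threshold on its own.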

On Thu, Nov 25, 2021 at 8:42 AM Hugo Slabbert <hugo () slabnet com> wrote:

What about some other options?

https://paris-traceroute.net/
https://dublin-traceroute.net/
https://github.com/rucarrol/traceflow

--
Hugo Slabbert


On Wed, Nov 24, 2021 at 9:54 AM Thomas Scott <mr.thomas.scott () gmail com>
wrote:

Ha, my apologies, I thought I was writing this for a Linux User Group,
not a NOG. Ignore my simplistic explanations.
- Thomas Scott | mr.thomas.scott () gmail com


On Wed, Nov 24, 2021 at 12:47 PM Thomas Scott <mr.thomas.scott () gmail com>
wrote:

I have used it successfully in a test environment that I was using ECMP
in. Most of the public networks that I've worked with don't use ECMP as
often as other methods for steering traffic (LAGs, BGP MEDs, etc).

Where I have found it fantastically useful is troubleshooting a transit
provider, for example when they were congested or had a flapping core
link. Granted, I *think* it's still subject to ICMP deprioritization
(most SPs use it prodigiously), and most MPLS cores don't decrement TTL,
but it was still useful to be able to show them "no, at this IP, I
*always* drop traffic, when..."
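
One common way to separate ICMP deprioritization from real forwarding loss
when reading per-hop results is to check whether loss seen at a hop persists
at every later hop. Loss that shows up at one router and then disappears
further along the path is more likely that router rate-limiting its own ICMP
replies. The sketch below applies that rule of thumb; it is a general
heuristic and not necessarily what was done in the case described above, and
the function name and loss figures are made up for illustration.

```python
def persistent_loss(loss_pct):
    """Given per-hop loss percentages ordered by TTL (last entry is the
    destination), return for each hop the minimum loss seen at that hop
    and at every later hop.

    Loss that does not persist downstream collapses toward zero, which
    is the usual sign of ICMP rate limiting rather than dropped traffic.
    """
    out = []
    running = float("inf")
    for loss in reversed(loss_pct):
        running = min(running, loss)
        out.append(running)
    out.reverse()
    return out


# Hypothetical observations: hop 2 shows 40% loss that vanishes at hop 3,
# while roughly 10% loss starting at hop 4 carries through to the end.
observed = [0.0, 40.0, 0.0, 12.0, 10.0, 11.0]
print(persistent_loss(observed))
# prints [0.0, 0.0, 0.0, 10.0, 10.0, 11.0]
```

In this made-up data the loss at hop 2 is discounted, while hop 4 is the
first hop whose loss survives to the destination.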

- Thomas Scott | mr.thomas.scott () gmail com


On Wed, Nov 24, 2021 at 12:23 PM Adam Thompson <athompson () merlin mb ca>
wrote:

The tool fbtracert (http://github.com/facebookarchive/fbtracert) was
mentioned here recently as a way to get visibility into multi-pathing.

Has anyone here ever used this tool successfully?



Supposedly Facebook uses this tool internally, but… that doesn’t help
much.



I’ve tried it on 4 different platforms/OSes (WSL Ubuntu; RedHat;
Debian; OpenBSD), and versions of Go (v1.10 through v1.16), in three very
different environments (on-prem public IP; on-prem NAT’d; cloud public IP),
and I’ve yet to see it produce any meaningful output – each
run/iteration/thread only detects one, single, hop out of the entire chain
of routers, making it less than useful.  Granted, that’s not a full
regression test by any means, but if anyone here has ever used it
successfully, could you please let me know what sort of environment you ran
it in/on?



Thanks,

-Adam



*Adam Thompson*
Consultant, Infrastructure Services
100 - 135 Innovation Drive
Winnipeg, MB, R3T 6A8
(204) 977-6824 or 1-800-430-6404 (MB only)
athompson () merlin mb ca
www.merlin.mb.ca





-- 
I tried to build a better future, a few times:
https://wayforward.archive.org/?site=https%3A%2F%2Fwww.icei.org

Dave Täht CEO, TekLibre, LLC
