<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Tooling on jason grey</title><link>https://jason-grey.com/tags/tooling/</link><description>Recent content in Tooling on jason grey</description><generator>Hugo</generator><language>en</language><lastBuildDate>Fri, 15 Nov 2024 00:00:00 +0000</lastBuildDate><atom:link href="https://jason-grey.com/tags/tooling/index.xml" rel="self" type="application/rss+xml"/><item><title>Launch of open source tool: gzinspector</title><link>https://jason-grey.com/posts/2024/gzinspector/</link><pubDate>Fri, 15 Nov 2024 00:00:00 +0000</pubDate><guid>https://jason-grey.com/posts/2024/gzinspector/</guid><description>&lt;p&gt;I published an open source tool &amp;ldquo;&lt;a href="https://github.com/jt55401/gzinspector" class="external-link" target="_blank" rel="noopener"&gt;gzinspector&lt;/a&gt;&amp;rdquo; to inspect gzip streams - specifically those encoded with many chunks.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;A robust command-line tool for inspecting and analyzing GZIP/ZLIB compressed files. GZInspector provides detailed information about compression chunks, headers, and content previews with support for both human-readable and JSON output formats.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I built it for the work I&amp;rsquo;ve been doing for &lt;a href="https://commoncrawl.org/" class="external-link" target="_blank" rel="noopener"&gt;Common Crawl&lt;/a&gt;, specifically processing &lt;a href="https://github.com/webrecorder/cdxj-indexer" class="external-link" target="_blank" rel="noopener"&gt;CDXJ&lt;/a&gt; indexes in the &amp;ldquo;&lt;a href="https://github.com/webrecorder/pywb/wiki/CDX-Index-Format#zipnum-sharded-cdx" class="external-link" target="_blank" rel="noopener"&gt;ZipNum&lt;/a&gt;&amp;rdquo; format.&lt;/p&gt;
&lt;p&gt;If you find it useful, let me know: reach out or star it on GitHub.&lt;/p&gt;</description></item><item><title>Common Crawl Checker</title><link>https://jason-grey.com/posts/2024/common-crawl-checker/</link><pubDate>Tue, 06 Feb 2024 00:00:00 +0000</pubDate><guid>https://jason-grey.com/posts/2024/common-crawl-checker/</guid><description>&lt;h1 id="enter-a-hostname-see-if-common-crawl-has-it"&gt;
 Enter a hostname, see if Common Crawl has it
 &lt;a class="heading-link" href="#enter-a-hostname-see-if-common-crawl-has-it"&gt;
 &lt;i class="fa-solid fa-link" aria-hidden="true" title="Link to heading"&gt;&lt;/i&gt;
 &lt;span class="sr-only"&gt;Link to heading&lt;/span&gt;
 &lt;/a&gt;
&lt;/h1&gt;
&lt;p&gt;This checks &lt;a href="https://www.commoncrawl.org/blog/november-december-2023-crawl-archive-now-available" class="external-link" target="_blank" rel="noopener"&gt;CC-MAIN-2023-50&lt;/a&gt;, the crawl from November/December 2023. I may update it in the future to check the latest crawl, but for now that&amp;rsquo;s what we have.&lt;/p&gt;
&lt;p&gt;Give it a try here:&lt;/p&gt;


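&lt;script&gt;
 // Ask the checker service whether Common Crawl has the entered hostname.
 function checkURL() {
 var resultEl = document.getElementById('result');
 // Show a placeholder while the request is in flight.
 resultEl.textContent = '...';
 var urlToCheck = document.getElementById('urlInput').value;
 var apiUrl = 'https://api.jason-grey.com/check_url?url=' + encodeURIComponent(urlToCheck);

 fetch(apiUrl)
 .then(response =&gt; {
 // fetch() only rejects on network failure, so treat HTTP error statuses as errors too.
 if (!response.ok) {
 throw new Error('HTTP ' + response.status);
 }
 return response.json();
 })
 .then(data =&gt; {
 resultEl.textContent = data.result;
 })
 .catch(error =&gt; {
 console.error('Error:', error);
 resultEl.textContent = 'Error calling the service';
 });
 }
 &lt;/script&gt;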

 &lt;input type="text" id="urlInput" placeholder="Enter domain name to check"&gt;
 &lt;button onclick="checkURL()"&gt;Check&lt;/button&gt;
 &lt;p id="result"&gt;&lt;/p&gt;</description></item></channel></rss>