Congressional Web Harvest
As the repository of the official records of Congress, the Center for Legislative Archives conducts a web harvest of all congressional websites at the end of each Congress.
Web harvests began with the 109th Congress in 2006, and the collection is available online. These snapshots of Congress' websites capture the evolution of the web as a medium for Congress to communicate with the public.
The most recent harvest, conducted in December 2016, preserved over 150 million URIs. The total volume of data captured approaches sixty terabytes. Recent harvests have expanded in scope to capture not only content hosted on Member and committee websites, but also content on a number of social media sites used by Congress.