Office of Government Information Services (OGIS)

Transcript

FOIA Advisory Committee Meeting (Virtual Event)

Thursday, September 7, 2023

10:00 a.m. (ET)

Michelle [producer]: Ladies and gentlemen, welcome and thank you for joining today's FOIA Advisory Committee meeting. Before we begin, please ensure you've opened the Webex participant and chat panels by using the associated icons located at the bottom of your screen. Please note all audio connections are muted at this time and this conference is being recorded. To present a comment via Webex audio, please click the raise hand icon on your Webex screen that's located at the bottom of your screen. This will place you in the comment queue. If you are connected to today's webinar via phone audio, please dial pound two on your telephone keypad to enter the comment queue. If you require technical assistance, please send a chat to the event producer. With that, I will turn the meeting over to Debra Steidel Wall, Deputy Archivist of the United States. Ma'am, please go ahead.

Debra Steidel Wall: Thank you, Michelle. And good morning everybody. Yes, I'm Deputy Archivist and on behalf of the Archivist of the United States, Dr. Colleen Shogan, who was unable to be here today, I'm very happy to welcome you all to the sixth meeting of the fifth term of the FOIA Advisory Committee. A packed agenda awaits us, but before the committee gets to that, I'd like to briefly mention two things. The first is about transparency in government and the second is about artificial intelligence. Regarding transparency, a Partnership for Public Service poll of 800 U.S. adults conducted late last year found that only about 20% of those polled say the federal government is transparent. That a majority of those polled by this nonprofit nonpartisan organization think that the federal government is not transparent is difficult for all of us to hear, whether you're a member of the FOIA Advisory Committee, a government FOIA professional, a member of the public who faithfully attends [and] tunes in to these meetings, or one of the National Archives staff who work really hard every day to make access happen.

And to be clear, making access happen isn't just a tagline here at the National Archives. It's the first of our four strategic goals in our 2022 through 2026 strategic plan, and it really is what we are all about, it's why we get up in the morning. To all of you gathered here today, our collective work, whether singularly or as part of a larger portfolio, is to ensure that the administration of the Freedom of Information Act embodies the transparency demanded by our democracy. So with a challenge such as a perceived lack of transparency comes a great opportunity along with new challenges and that's where artificial intelligence, AI, comes in. Machine learning, a subfield of AI, will be required if we're to automate FOIA searches for the billions and billions of records that government agencies hold. And of course, at the same time, concerns exist about standards of use and thoughtful legal analysis.

The Archivist and I are really pleased that this is an area of interest to the committee, so I'm very happy to welcome a distinguished team from the State Department that's working on several very exciting pilot projects pertaining to machine learning and declassification as well as FOIA. So those of you here from the State Department, we really appreciate you sharing your knowledge and experience with us. The challenges of FOIA and declassification don't rest with a single agency, though several agencies including State are taking the lead, but we must all work together to form partnerships to tackle the government's toughest challenges and that's what we're doing here today. So thank you State Department partners for sharing your expertise with the FOIA Advisory Committee and those of us in attendance. With that, I turn the floor back over to Alina.

Alina M. Semo: Thank you, Debra. I really appreciate those opening remarks and I hope you can stay for a little bit and listen to the great presentations we have today. Good morning and welcome everyone. As the Director of the Office of Government Information Services, OGIS, and this committee's chairperson, it is my pleasure to welcome all of you to the sixth meeting of the fifth term of the FOIA Advisory Committee. I want to pick up on something that the Deputy Archivist noted in her opening remarks regarding a perceived lack of government transparency among those polled by the Partnership for Public Service. This advisory committee and indeed the approximately 1,000 other advisory committees across 50 federal agencies operate in accordance with the Federal Advisory Committee Act, or FACA. The work of this federal advisory committee is particularly important to government transparency because both the committee's deliberations here and the end result of the committee's work, recommendations to improve the administration of FOIA, center on the importance of transparency.

FACA requires open access to committee meetings and operations. That means we will upload a transcript and minutes of this meeting as soon as they are ready, in accordance with FACA. The committee's Designated Federal Officer, DFO, Kirsten Mitchell, and I have both certified the minutes from the June 8 meeting, and those, along with the transcript, are posted on the OGIS website in accordance with FACA. Also in accordance with FACA, this meeting is public and I want to welcome our colleagues and friends from the FOIA community and elsewhere who are watching us either via Webex or with a slight delay on the National Archives YouTube channel. Committee members' names and biographies are posted on our website, www.archives.gov/ogis.

I have a few housekeeping remarks and then we're going to launch into what undoubtedly will be a very exciting agenda today. First, I just want to note, I am advised that Professor Gbemende Johnson will be joining us around 11:00 AM. And Kirsten, as the committee's DFO, I believe you have taken a visual roll call. Can you please confirm we have a quorum?

Kirsten Mitchell: We do indeed have a quorum. Thank you, Alina. I'm not sure if you heard me, but we do indeed have a quorum.

Alina M. Semo: Okay. Thank you, Kirsten. I just also want to note I see Alex Howard, one of our committee members, on the attendee side. If Michelle could please move him over, that would be wonderful. I want to let everyone know meeting materials are available on the committee's webpage. Click on the link for the 2022 to 2024 FOIA Advisory Committee on the OGIS website. During today's meeting, as I always do, I will do my best to keep an eye out for any committee members who raise their hand when they have a question or a comment. Committee members should also use the “all panelists” option from the dropdown menu in the chat function when you would like to speak or ask a question. You could also chat me or Kirsten directly. But please, as I often note in order to comply with both the spirit and intent of the FACA, committee members should keep any communications in the chat function to only housekeeping or procedural matters and no substantive comments should be made in the chat function as they would not be recorded in the transcript of the meeting.

As I mentioned, we do have a packed agenda today but we are planning to take a short break at approximately 11:35 depending on the pace of our morning. If a committee member needs to take a break at any time, please do not disconnect from the web event. Instead, mute your microphone by using the camera icon. Please send a quick chat to me and Kirsten to let us know if you'll be gone for more than a few minutes and join us again as soon as you can. And a reminder to all of our committee members, please identify yourself by name and affiliation each time you speak today. I am equally guilty of forgetting to do that sometimes, but this helps us down the road tremendously with both the transcript and the minutes, both of which are required by FACA.

Members of the public who wish to submit written public comments to the committee may do so using our public comments form. We review all public comments and post them as soon as we are able if they comply with our public comments posting policy. In addition to written public comments we have already posted, there have been ten since the June 8th meeting, we will have the opportunity for oral public comments at the end of today's meeting. As we noted in our federal register notice announcing this meeting, public comments will be limited to three minutes per individual. Regarding today's meeting, we have scheduled it to go until 1:00 PM rather than our normal 12:30 PM end time. We are allowing for that extra 30 minutes in the event the committee needs additional time to conduct its business. So without further ado, I'm just going to ask my fellow committee members, does anyone have any questions before we move on from housekeeping? I don't hear anyone and I don't see anyone raising their hand, so I'm just going to move on.

So on to our busy agenda today. First, we will hear a presentation from Eric Stein, David Kirby, and Giorleny Altamirano Rayo on piloting machine learning for FOIA requests. They're joining us from the State Department and we're thrilled to have them here today. Committee members, I want to ask you to please hold your questions until the end of their presentation, after which, I hope we will have a robust discussion and conversation with our presenters. After that, we will take a 15-minute break and after the break we will hear report outs from the committee's three subcommittees, Resources, Implementation and Modernization. Co-chairs of each subcommittee will provide updates on their work. And we will close the meeting with a public comment period.

So without further ado, and we're running ahead of schedule, so that's good news, more time for the great presentations we have today, I am very excited to welcome from the U.S. State Department, Eric Stein, David Kirby and Gio Altamirano Rayo to discuss the great work that they are doing at their agency with regard to access and AI. I will be brief in my introductions. Gio has a hard stop at 11:00 AM, Eric and David will be able to stay through until the break to answer committee members' questions and engage in light but collegial banter back and forth, which I hope will ensue.

First, I want to introduce Eric Stein. Eric currently serves as the Deputy Assistant Secretary for the Office of Global Information Services, A/GIS, at the State Department. He previously served as the Department's Director of the Office of Information Programs and Services responsible for records management, FOIA, the Privacy Act, classification, declassification, library and other records and information access programs. Eric has served in key leadership roles involving the Department's improvement of records management and agency-wide FOIA initiatives.

I have had the pleasure of working with Eric for several years now in his capacity as co-chair of the Technology Committee of the Chief FOIA Officers Council. Eric's career at State has included serving as a coordinator of the information sharing environment, an inter-agency effort to improve the sharing of terrorism-related information throughout the federal government, as well as with state, local and tribal governments and foreign partners. He has worked on several cross-cutting Department-wide programs, including as an intra- and inter-agency coordinator on the State Department's efforts to mitigate the WikiLeaks incidents, as the Department's point of contact for controlled unclassified information, CUI, mandated by Executive Order 13556, tribal consultations and other cross-cutting Department-wide programs. Eric holds a BA in political science from Boston College and an MA in politics, American government from the Catholic University of America. Welcome, Eric.

David Kirby is currently the IT Program Manager in the State Department's Bureau of Information Resource Management, IRM, where he is product owner for the Department's enterprise eRecords archive and is responsible for overseeing the development and maintenance of the system. Since joining the Department of State in 2006, David has been involved in supporting several enterprise applications including the State Messaging and Archive Retrieval Toolkit, that acronym is SMART, S-M-A-R-T. SMART manages the dissemination and capture of official reporting cables between the Department and its overseas posts and the inter-agency community. Prior to joining State, David spent seven years at the Department of Defense. David holds a BA in history from George Mason University and an MS in management information systems from George Washington University.

Last, I would like to introduce Gio Altamirano Rayo. Dr. Gio Altamirano Rayo has 15 years of public service experience including the federal government and academia. Before joining the State Department, Gio spent five years at the Department of Labor as well as time as an applied researcher in U.S. academia and as a diplomat at the Ministry of Foreign Affairs in Nicaragua. Most recently, she spent two and a half years as the Senior Mathematical Statistician at the Chief Evaluation Office within the Department of Labor where she worked to democratize reliable and safe statistical and AI machine learning methods; and she led the behavioral economics and human-centered CX portfolio to benefit [the] Department of Labor's 16 sub agencies and regional offices.

In June 2023, she joined the Center for Analytics in the State Department's Office of Management Strategy and Solutions, where she serves as the Department's Chief Data Scientist and Responsible Artificial Intelligence Official. A former National Science Foundation scholar, Gio holds a JD from the American University Nicaragua, an LLM from Vanderbilt with a Fulbright scholarship and a PhD in political science from the University of Texas-Austin. She was awarded a postdoctoral fellowship from Carnegie Mellon University. Gio is a level two certified acquisition professional in program management and is certified in privacy by the Federal Privacy Council. Gio is bilingual in English and Spanish and is fluent in Brazilian Portuguese. So welcome, all three of you. We're very happy to have you and I'm going to turn things over to Eric to get things started.

Eric Stein: Well, good morning. Thank you, Alina, Bobby, FOIA Advisory Committee and all the members of the public and FOIA community joining us here today. We're very excited to get into this presentation, and I have just a couple of opening remarks. So at State we've been piloting the use of machine learning to do document reviews and we'll be walking through today some of the successes we've had in our declassification program with a pilot that we've operationalized, which is also directly relevant to FOIA because we've learned a lot of lessons on the searching of records, the functionality of machine learning and so forth. One of the themes of today's presentation is partnership and, as Alina mentioned, I'm joined by two esteemed colleagues here from our Chief Data Officer's office, the Center for Analytics, and our CIO's office. None of this would've been possible without the three of our organizations working together within the Department, but also the National Archives, the Department of Justice, the Chief FOIA Officer Technology Committee and of course groups like the FOIA Advisory Council as well. It's nice to be back here to present and I really appreciate the invitation.

We're going to look at how we'd leverage technology to address today's common problems we're seeing with the growing volumes of requests for information and the growing volumes of information, which includes data of course, too. And we are going to also discuss how we've considered ethical, bias, privacy and other considerations as well as the need for human interaction in a process so that technology is not just running itself, but rather we are taking into account these different variables and doing risk mitigation to the best extent possible. I understand there are going to be a series of questions and there are already questions: What do we see in other agencies? What have we learned? And there are several lessons learned slides we'll be going through today. The first half of the presentation will explain what AI and machine learning work we've done here at State in a declass pilot that we've operationalized, but then we will pivot to FOIA and then show the connection of what we learned from the first pilot and why it's directly relevant to the two FOIA pilots underway at the Department right now. Next slide, please.

Let's start with a picture. On the left-hand side, you have a graph that shows classified cables that require review each year. Just by way of background, at 25 years, classified information is up for declassification. Agencies that work with classified information have procedures in place to review their paper and electronic information. For today, we'll talk about electronic information because it's technology. But you see on the left-hand side, 25 years ago, at least for the start of the pilot from 1997 when we were working on it, we had about a hundred thousand cables, which are communications between the State Department and Washington and overseas posts that required review each year. With the existing resources we have, that's a challenge to review that volume of information because these are the actual cables, the page counts are larger. But you look over time, we have a big growth in cable traffic that's going to occur in the next several years. And the question is how will we ever address this declassification review demand, which is also directly related to growing FOIA requests in the volume of electronic records that we have as well.

So if you look on the right side, you have a classified email graph. And when you look back 25 years ago, in the late nineties, agencies didn't have much email, let alone classified email, but we see an explosion over the next several years in the volume of records that are going to need to be reviewed. And if you look at the Y axis, the one that goes up and down the left-hand side, it is different on each of the graphs. On the right side it's half a million. So you see these jumps in the volume of information being generated. And these are just emails, let alone the many other data sources and records here at the Department. We have a challenge ahead of us and this is what led to one of the pilots we'll be talking about. We'll be seeing this slide later on. We like using this to start, just to show that, looking at the level of resources, procedures, processes and technology we have in place, without a change we were going to have big problems meeting demands, which we're already struggling to do as it is. Next slide please.

So an overview for today, we're going to do a quick overview of data and artificial intelligence followed by the three examples of the pilots, the one about the Machine Learning Declass Pilot I was just briefly talking about, and then two for FOIA, one about a Customer Experience Pilot to improve public engagement with our agency website down the road and another, the FOIA Search Pilot, that leverages the technology we used in the Declassification Review Pilot. Also, there'll be lessons learned after the Machine Learning Declass Pilot discussion and then a slide on lessons learned so far after the FOIA pilots as well. The Machine Learning Pilot for declassification was completed, it ran October through January, and I don't want to get too far ahead, but it worked and we've operationalized it. The FOIA pilots are actually currently underway. They started in June and they go through February of next year. And we'll explain how this works at the State Department, what my role is, what Gio's role is, what David's role is, and each of the systems they oversee and how we made all this work. And then we'll leave plenty of time for discussion as well. Next slide please.

So just starting out, the State Department's policies are in what's called the Foreign Affairs Manual, or FAM, which is publicly available at fam.state.gov. And the FAM is our central set of policies here at the Department. And it has definitions of terms like data, artificial intelligence, and records. And I think this is important to make sure that we're talking about the same thing, because when you say data or records or artificial intelligence, people have different things that come to mind. So we have a couple slides that'll lay out definitions just so, as we progress in the presentation, we're working upon the same foundation.

In December, the Department issued its data policy, that was from the Center for Analytics and Office of Management Strategies and Solutions, where Gio works. So that's publicly available. And also, in April of '23, this year, we issued our AI policy. And I just want to point out that while my office oversees the Foreign Affairs Manual and I have privacy and other responsibilities, it's really the Chief Data Officer who has primary responsibility and Gio, in her role, over AI and the different considerations for data analytics and so forth. But just as the starting point for this foundation, for the discussion, I'm going to turn it over to Gio. And can we go to the next slide, please?

Gio Altamirano Rayo: Thank you very much, Eric, and good morning to everyone and thank you so much for inviting us to be in this important conversation. So at the Center for Analytics at the State Department, our goal is to inform the practice, to inform the management of diplomacy and to provide insights that drive diplomacy at the highest level. And we call the Center for Analytics CFA for short. And we started about three years ago with three data scientists and a single project. And we've grown since then based on the demand for much larger and much more mature organizations with a charge of enabling a culture of data-informed and evidence-based decision-making at all levels of the organization. So this is super awesome to me because this has been my curiosity my entire life. What works, what doesn't, and how can we do more of what works?

Our leader, Dr. Matthew Graviss, is the Department's first ever Chief Data and AI Officer, and under his leadership, CFA developed the State Department's first ever Enterprise Data Strategy and the Department's AI policy, what we call 20 FAM 201, AI Policies and Procedure, in the Foreign Affairs Manual that Eric talked about, and that is available for everyone to see online. And I'm happy to report some really exciting news - next month we are going to be also launching the first ever Enterprise AI Strategy. We've done this with the help of the AI Steering Committee, which I co-chair with the Department's Chief Technology Officer. And the Enterprise AI Strategy, or what we call the EAI Strategy, is responsible for outlining or laying out the framework so that the Department can responsibly, safely and securely harness the capabilities of AI to advance our work. Now, the key terms here are safe, secure and trustworthy. And I am the Responsible AI Official, so this is the reason why we want to have a plan, a game plan, so that if we do this and when we do this, we do this right. You can follow the updates on the AI strategy launch on our website. You can search for us as the U.S. Department of State Center for Analytics, and you'll be tuning into the kinds of things that we're up to here at CFA. Next slide please.

So as promised, here is the definition slide, and this is super important so that we're all on the same page. Our FAM has this definition of data, but it also has definitions of 30 other data-related terms. So in the FAM, data is defined as the recorded information regardless of form or the media on which it is recorded. One example of data we collect and analyze is staff demographics by race, ethnicity, sex and other variables. These kinds of data allow the Department to assess whether it reflects the rich diversity of our nation. And you can see this data actually online in our DEAI Demographic Baseline Report. You could literally Google that, State Department DEAI, and it will come up. That's in terms of data. In terms of artificial intelligence, the definition is aligned with the definition from the National Defense Authorization Act. You can see on the slide that this term is a little bit more complex, so it has five bullets. But to wrap our heads around this, we can just think about AI in terms of how we use AI or deploy AI in our declassification project. I'll talk about that later. AI is a buzzword, but what it means is better grasped through an example. Next slide, please.

Now, these other terms are super important because they've been buzzing around in the media and in our collective consciousness. Generative AI, well, there's a very well-known example of generative AI and that's ChatGPT. It basically generates human-like text based on input and questions it receives for a prompt from a chat box. There are other generative AI applications and they can create pictures and audio. Use case is pretty straightforward. When we say AI use case, we're referring to any department application or use of AI to advance our work. This includes both existing and new use cases of AI. AI service is an application or tool that uses AI capabilities from a third party. For example, we have this in our FOIAXpress. And lastly, discriminative AI. Notice that we did not put anything after that discriminative AI, we just placed it there. We left it blank because it's actually not defined in the FAM, but it's an important concept for our work. Discriminative AI is a model that learns and distinguishes similarities and differences in data to predict labels or classifications. I'll refer to this later when we talk about the machine learning model or the AI model for our declassification pilot. Now, I'll pass it on to David for an overview of e-Records.

David Kirby: Thank you. So many of the AI/ML efforts we're going to talk about today were made possible due to advancements that [the] Department made on electronic records management. So back in 2016, we started what we call the eRecords Archive, which was a centralized archive initially to capture email records. We've since then expanded that to include other types of electronic records as well. We've also implemented a streamlined workflow that allows bureaus and posts to retire records to the archive in an easier manner. And we also index all cable records in the Department's SMART archive.

As part of the archiving process, we run a metadata enrichment process that adds over 70 different metadata elements that aid in discovery for searchers, and we have a search interface that allows authorized users to search for emails, files and cables from a single query. The archive is available on both the OpenNet and ClassNet networks, which are the Department's unclassified and classified networks, and we capture over 2 million unique records every day. Currently, the archive contains over 3 billion unique records. So as you can imagine, that presents a tremendous challenge to search and discovery for FOIA and for other use cases as well. Next slide, please. And I'll turn that over to Eric. Thank you.

Eric Stein: All right, thank you. I know we just provided a lot of information. So just to kind of recap as we position and move forward, Gio, David and I are in three different parts of the State Department, three different what are called bureaus, or organizations, and agencies are very different in the way they're structured. And I think that's one of the common themes and questions, and what I hear is, "Well, can this FOIA solution that you've come up with here or this record solution work in other places?" Maybe. There are a lot of variables and factors to consider. We've talked about those in other sessions before. But for us here, I think one of the questions that was posed to us is, well, what does it take to use machine learning and AI? And what it takes is several things. I mean, on the records front, we needed to have the eRecords archive. That's done a lot. That created the foundation, the data that allowed us to do a lot of what we were able to do. We needed to have a Center for Analytics. And what Gio pointed out is we've had this office for a few years now, and so we needed that in place. We needed to have an AI program just in general here at the Department that had policies, and we needed to have partnerships, and there needed to be a will for such partnerships to take risks and try new things. And of course, resources, money had to be spent as well to ensure these things happen. So what we're going to do now is look at the Machine Learning Declass Pilot, do a little deeper dive into that, it went from October through January, and then we'll get into the two FOIA examples, FOIA Customer Experience, as we joke, putting the AI in FOIA, and then the FOIA Search Pilot as well, and give some of the lessons learned that we've shared publicly in the past about what we've seen so far. So next slide, please.

All right. So just how did we get here? The Partnership for Public Service was previously mentioned, it's an organization that offers a variety of training. Well, it just so happens they have an AI course for senior leaders at the GS-15 or Senior Executive Service or senior level, and it's free for those senior officials. And you apply for it, I think actually the application process is open right now if you Google Partnership for Public Service. And what this course does is over a series of several months, if you're selected for it, you go to four hours of training once a month and learn about the different building blocks of policy on AI, considerations to operationalize it and probably most importantly, you get the opportunity to collaborate with other senior leaders who either maybe are learning about AI for the first time or those who are experts in the actual work of machine learning and AI.

And so from my experience in October '21 through May of '22, in this course, I realized we had through eRecords and the Center for Analytics an opportunity and all the tools we needed to maybe try something involving the declassification of records that are 25 years or older. So what we're going to do in a minute, I was going to turn it back over to David to explain how our process used to work with eRecords and what we've done. And I'd also just start out by saying prior to this machine learning pilot, the State Department would take a very manual review of records they want to declassify for ultimate public release, meaning someone would sit at a computer and click declassify, declassify, declassify, may need to remain classified, declassify and go through each of those a hundred thousand records from the first slide that we showed manually. And so what I thought about was the results from our declass reviews typically show 98 to 99% of what we review from 25 years ago gets declassified. So why are we committing so many resources to doing review of records where this much information is actually getting declassified? Is there another way we could do this review, of course with human quality control steps in place, so that we can reposition our resources to address other demands for information from the public, from the various constituencies we serve and so forth. And that's the foundation of this machine learning pilot we're going to talk about today.

Next slide please. And with this, David, I'll go back over to you to kind of talk about what we built in eRecords and so forth.

David Kirby: Thanks Eric. So the eRecords platform that I talked about earlier currently supports over 25 different use cases across different bureaus in the Department to include FOIA, litigation, historical research, diplomatic security investigations, and many others.

For this specific declass effort, a few years ago we developed a separate module for eRecords to assist with that manual declassification effort that Eric talked about. We call this the ADR module and it helps by automatically queuing up the records that are eligible for review and includes search capabilities and other features to help streamline that review process. But again, as Eric mentioned, up until now, it's been very much a manual effort.

So with that, I'm going to turn it over to Eric. He's going to talk about the machine learning pilot. Thank you. Next slide.

Eric Stein: All right, so we're back to this slide again. We're really close to talking about machine learning in AI. So this is where Gio will take the stage very shortly and walk us through the actual technology, kind of looking inside the brain of how this happened, how this worked. It's typically a question we get asked, "Well how does this work? Explain it to me," and we will go through exactly how that science works very shortly.

So here we are again. In the context, we are here. So at the start of the pilot, we were here, maybe we should be past tense. We had 100,000 or so cables that needed to be reviewed and we thought, "Well, wait a minute, what if we took the results from maybe 1995-1996 declassified records and used that as a foundation to train a model to do a review?" So in other words, we assumed that our human reviews from 1995 and 1996 were perfect, which is of course a risk because nothing's perfect. But we had baseline data of humans who did complete reviews of these electronic records, of these records, and we trained a machine learning model to do a review from that. And that is how we started. So it's not like we just started with feeding a bunch of information into a system and saying, "Now make recommendations." The baseline of everything we've done here is a human decision.

And in this process, not to get too far ahead, we've actually learned through the technology where we can improve our process in our declass review, where maybe we can collaborate better with other partners, other agencies, terms we may want to use, better quality control steps, maybe steps you can cut out or steps that need to be added in the review. And that came from the perspectives of a team of data scientists looking at this and making very objective assessments of, "Well why do you do it that way?" And sometimes it was a little bit humbling. Well, that's interesting, because we've always done it that way, which is not a great answer. So we needed to look at maybe some process re-engineering as we went through this as well. Next slide please.

All right, so this was the pilot proposal from the actual course. It's always humbling to go back and see your own work, but pretty much to summarize this, it's what was the challenge? How could we use technology to review records where we year-over-year declassify 98 to 99%? And would this even be possible? Could we even train a model to do this? We had to identify key stakeholders and partners. And then, if we can go to the next slide please. And then we had goals and objectives, and ambitiously, I really wanted to start this in June of '22 and finish in October. But again, with other competing priorities, we were able to start this in October. So it slid a little bit to the right and then go through January of '23. Next slide please. So at this point I'm going to turn it over to Gio to explain what we did actually with the machine learning, the data, to make all this work. So Gio back over to you.

Gio Altamirano Rayo: Thank you very much, Eric. I'll talk a little bit about AI. AI, a big buzzword, but within AI, if you see that as a big round circle, there is a subsection of that circle that's called machine learning, and I can talk about how that machine learning methodology was used in our declassification process.

So we talked about how our reviewers would manually classify cables as exempt or declassified, right? And it's literally manually going over the thing and then just saying exempt declassify. That's like a repeated task and it's one specific repeated task over and over and over and over again. So rather than do that, we trained a model, or basically an algorithm using human declassification decisions made in 2020 and 2021 on cables classified confidential and secret in 1995 and 1996 to recreate those decisions on classified cables in 1997. So these decisions, remember, were made 25 years after the original classification of the cables. So we grab that and we basically labeled.

Our model, our machine learning model then detects patterns and predicts those labels for new and undecided cables. So we have a corpus, we have that labeled, we train a model with that, and then the model can predict the labels of a new set of cables. You can see the example results on the right. These are a result of basically discriminative AI, the term that I mentioned earlier in the presentation. Over 300,000 classified cables were used for training and testing during the pilot. The pilot took three months and we had five fantastic, wonderful dedicated data scientists to develop and train the model. Next slide please.
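[Editor's illustration: the label, train, and predict loop described above can be sketched in a few lines of Python. The presentation does not name the algorithm the Department used, so the naive Bayes word-count classifier below is only a stand-in, and the four training texts and both function names are invented for illustration.]

```python
import math
from collections import Counter, defaultdict


def train(labeled_docs):
    """Count words per label from (text, label) pairs of past human decisions."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in labeled_docs:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts


def predict(text, word_counts, label_counts):
    """Return the most probable label for a new text (naive Bayes, Laplace smoothing)."""
    vocab = {w for counts in word_counts.values() for w in counts}
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in label_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)  # prior from label frequency
        for word in text.lower().split():
            # Add-one smoothing so unseen words do not zero out a label.
            score += math.log((word_counts[label][word] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy corpus standing in for previously reviewed cables.
history = [
    ("routine visit schedule logistics", "declassify"),
    ("press summary public remarks", "declassify"),
    ("source identity protected informant", "exempt"),
    ("informant source meeting details", "exempt"),
]
word_counts, label_counts = train(history)
new_label = predict("visit logistics summary", word_counts, label_counts)
```

[A production model would be trained on far more data and would output a confidence score rather than only a label.]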

Thank you. So just to go a little bit in depth of what this all means and how our model can predict the next batch of cables that you give it, for every cable that's being processed, our model, our machine learning model outputs a confidence score from zero to one. In the second step we created a threshold or cutoff scores to base the classification decision. So if you look at the bottom half of the slide, you can see the threshold and their associated predictions.

So anything between zero and 0.10 would most likely be the declassified cables. Anything between 0.90 and one would most likely be exempted cables, so those would remain classified. For the cables that fall in the middle, between the thresholds, the model is unsure and those would require a manual review. So just to be a hundred percent clear, this AI machine learning technology does not replace human reviewers, but we can augment our work by creating these three buckets: the declassified cables for sure, the exempted cables for sure, and the ones that are in the middle, which is the bucket that humans need to review and look at closely. So we can leverage this AI technology to do the tedious parts, while leaving the critical decision making to our staff. Next slide please.
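[Editor's illustration: a minimal Python sketch of the three-bucket routing described here. The cutoffs of 0.10 and 0.90 are our reading of the slide as described; the function name, the label strings, and the treatment of scores that fall exactly on a cutoff are assumptions.]

```python
def route_cable(score: float, low: float = 0.10, high: float = 0.90) -> str:
    """Map a model confidence score in [0, 1] to one of three review buckets."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be between 0 and 1, got {score}")
    if score <= low:
        return "declassify"     # model is confident the cable can be released
    if score >= high:
        return "exempt"         # model is confident the cable stays classified
    return "manual_review"      # model is unsure, so a human reviewer decides


scores = [0.02, 0.10, 0.47, 0.90, 0.99]
buckets = [route_cable(s) for s in scores]
```

[Moving the two cutoffs closer together sends fewer cables to manual review but accepts more model decisions at lower confidence.]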

So for this pilot, and like Eric said, we started in 2022, we used cables from 1997, and our total set of cables was just over 78,000 cables. This pilot used both the model and manual human review and this provided a baseline for us or a reference point for us to understand the effectiveness of our model. In the table to the left, you can see the breakdown of the cables that were analyzed with our model with the top row showing the number of cables that were correctly classified. And as you can see, a large majority were correctly labeled to be declassified with a small error rate. As expected, many also required a second step of manual review and a small minority were labeled as exempt. Our pilot program, we basically achieved something around 96% agreement with human reviewers while reducing up to 63% of the burden of having to do this manually every single time. Next slide please.
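[Editor's illustration: the two headline figures, agreement with human reviewers and reduction in manual burden, can be computed as sketched below. The exact definitions the team used are not given in the presentation, so both formulas are assumptions, and the five toy labels are invented; they do not reproduce the pilot's 96% and 63% figures.]

```python
def agreement_rate(model_labels, human_labels):
    """Share of model-decided cables where the model matched the human decision."""
    decided = [(m, h) for m, h in zip(model_labels, human_labels) if m != "manual_review"]
    if not decided:
        raise ValueError("the model made no automatic decisions")
    return sum(m == h for m, h in decided) / len(decided)


def workload_reduction(model_labels):
    """Share of all cables that no longer need a full manual review."""
    if not model_labels:
        raise ValueError("no cables to evaluate")
    return sum(m != "manual_review" for m in model_labels) / len(model_labels)


# Hypothetical toy data: model output next to the human baseline decision.
model = ["declassify", "declassify", "manual_review", "exempt", "declassify"]
human = ["declassify", "declassify", "exempt", "exempt", "exempt"]

agreement = agreement_rate(model, human)   # 3 of 4 decided cables match
reduction = workload_reduction(model)      # 4 of 5 cables skip full manual review
```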

For what we call the cable set, so basically the corpus of cables from 1998, we fully operationalized our model. So that's why you see a lot of question marks on this slide. We don't have an error rate, threshold accuracy, none of those metrics, because we're not comparing the model to the human reviewer. But it's important to remember that just because we operationalized our AI model, it doesn't mean that there are no humans in the loop, that there are no human reviewers in the entire process. For the 47,000 cables that the model did not classify, those were the ones in the middle bucket, and those do need manual review. So we also have a random cut or a random subset from the declassified and exempt sets to make sure that it's good, that it's accurate. So this is basically our quality control process, and this is going to help train the model in the future. So there's always improvement and iteration as it occurs in time. Next slide, please.
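[Editor's illustration: drawing a random subset of the automatically decided cables for human quality control could look like the Python sketch below. The 5% sampling rate, the function name, and the integer cable IDs are assumptions; the presentation does not say how large the quality control sample was.]

```python
import random


def qc_sample(cable_ids, fraction=0.05, seed=None):
    """Pick a random subset of auto-decided cables for a human spot check."""
    ids = list(cable_ids)
    if not ids:
        return []
    rng = random.Random(seed)                 # seed makes the draw repeatable
    k = max(1, round(len(ids) * fraction))    # always check at least one cable
    return rng.sample(ids, k)                 # sampling without replacement


auto_declassified = range(1000)               # hypothetical cable IDs
to_check = qc_sample(auto_declassified, fraction=0.05, seed=7)
```

[Disagreements found in the spot check can be fed back as new labeled examples when the model is retrained.]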

As Eric was saying at the beginning of this presentation, to us, it is super clear that our manual review process of cables is not sustainable because of the burgeoning, like the surge of information that actually occurs from one moment in time in the past to the next moment. It's really incredible that the scale at which this surge occurs is really, really high. So given the exponential surge, this small scale pilot offers us a proof of concept to scale and integrate this technology into our routine declassification process. So we'll apply this model to the 1998 cables and then we will use this process in future years.

During the pilot, we learned that collaborating with the Department's Office of the Historian would help strengthen future declassification review models, too. So in future iterations of this, we could provide input about world events during the years of records being reviewed, helping the model be more accurate, be more precise, so basically maturing the model as we go along.

Another thing that we've been thinking about is auto-redaction as a key solution to quickly release documents and reduce manual input. These processes of course are going to take time to operationalize, but this will give us time to mature our model for the incoming cables. Like Eric said, in 1997, remember that table that he showed, those two pictures that he showed in 1997-1998, when the amount of emails that were there, it doesn't exist because at that time the State Department didn't even have email. Now think about doing this declassification with the advent of email.

So when we're dealing with future years, the number of classified emails doubles every two years, rising to 12 million emails in 2018. So it is humanly impossible, it's not possible for a human to actually review this manually every single time.
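[Editor's illustration: the growth rate quoted here, doubling every two years and reaching 12 million in 2018, implies the simple exponential curve sketched below in Python. It is an extrapolation from that single stated figure and rate, not Department data for any other year, and the function name is hypothetical.]

```python
def projected_emails(year, base_year=2018, base_count=12_000_000, doubling_years=2.0):
    """Estimate classified email volume assuming a constant doubling period."""
    return base_count * 2 ** ((year - base_year) / doubling_years)


# Working backward from the 2018 figure under the stated doubling rate.
volume_2018 = projected_emails(2018)
volume_2016 = projected_emails(2016)
volume_2010 = projected_emails(2010)
```

[Under this assumption the volume in 2010 would have been one sixteenth of the 2018 figure, which shows how quickly a fixed team of manual reviewers falls behind.]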

So in addition to that, we also have other ad hoc projects that will help with the declassification process and we'll also take what we've learned from this pilot into our FOIA process. And with that, I'll pass it on to Eric.

Eric Stein: Great, thank you Gio. If we go to the next slide please. All right, so all that as a backdrop, a couple things relevant to the FOIA community here. From this review process, we are going to be able here at the Department to publicly release cables through proactive disclosures in FOIA. So in other words, the records will ultimately, copies will ultimately go to the National Archives, but we'll be able to post the results from these reviews onto our FOIA website, and plan to do so later this year. So we're looking for ways directly to start informing the public. It's not just we're doing these reviews, but how do we get information out to the public. So we're very excited about the volume of information that through proactive disclosures will be going out starting later this year in addition to our release to one release to all policy on FOIA that we've had in place for years.

In terms of lessons learned, in no particular order here, you need quality data and with a similar data set, like one set of cables, there's just cables which are standardized in look and feel, things worked well. But we started to find challenges when new data sets were introduced. Let me give you an example. If you have an email with two or three different types of attachments to it, this model would probably start to break down because it was developed specifically for cables, and we're looking now at how to apply it to other records, here we have State emails and memos, and other things that we have. And all of that requires training. So it's not like, "Oh, we got AI, great, and you can go use machine learning for everything now." And that kind of plays into another point we have on here about starting small, which I'll get to in a moment, too.

Partnerships are critical to success. I've said that a few times, but I can't emphasize it enough. Working with David, Gio, and our respective teams is truly a delight, and I think that we all learn from one another and it's been a great partnership over years.

Starting small, so identify a project and a scope and be open to results and feedback. You may fail. And so, one way or the other, we accepted [it] from the beginning. If this wasn't successful, we wanted to share that with interagency colleagues about maybe what not to do or what didn't work for us to see what could work at other agencies. And this has led to dialogues in different interagency communities on what's working, what's not working, and what are you thinking about that may work in your institution. Be open to results and feedback and that includes new approaches, which we talked about a little bit before. Patience. Avoid jumping to conclusions and quick judgements. "It works, let's use AI for everything now," which is the tendency, "Oh, let's just use AI," and that may not be a solution that works. Develop quality control checks for results and that includes human review. To Gio's point before, there's a human quality control check in all three of those buckets. So there's what could be declassified, what needs to stay classified, and of course the manual review.

And then recurring sustainable success will require ongoing training of a model using inputs from humans and technology. So it's just like training an employee, you need to have ongoing training of a model, because we don't want to say, "Well, everything about this specific country, just because we released it 25 years ago, can be released today." Or everything on this topic or everything on this matter. I mean, in a heartbeat, something changes and all of a sudden something that wasn't so sensitive now may be. So there needs to be room for nuance and continuous improvement as well. Let's go to the next slide please. And now we're going to really get into FOIA.

So with the successful pilot and operationalized behind us on declassification, also relevant to FOIA for the use of B1, to exempt information or to redact information or declassify for public release, we thought what worked well from this declassification pilot that could apply to work in the FOIA community? Here's what did not work, what we found, we're not at the point of applying redactions, yet. So the FOIA with the nine exemptions is so nuanced, especially when you start getting into B3 statutes or considerations on a foreseeable harm in B5. I think those are the fears that people have in the public that we're just going to apply a model just like B5 everything and nothing will ever come out again. That's not what we want either. So when we talk about looking at the nine different FOIA exemptions, and so B5 has to do with deliberative process and there's a standard called foreseeable harm involved in it. We want to make sure that if we're training a model to apply redactions, it actually can do that well. So far only it works, technology, and it's not even machine learning necessarily, works well with email accounts, names, and some other privacy data. It's not quite to the point of actually becoming cognizant of, "Oh, this might be a nuanced situation where we need to," it's just not there, in our experience so far here.

So what went well, we found that the technology has a great way of sorting large volumes of data and information and making connections and seeing things that we may miss when we're manually going through large volumes of data records and information. So our first, in no particular order, FOIA pilot is about customer experience in our public websites. And we wanted to think about “how can we improve our public website?” We get assessed annually by the Center for Plain Language on our public websites. FOIA was one of them at the State Department recently. So our thought was could we automate our process of engaging with the public, helping the public to maybe find existing records that are already available to the public, and then automate a customer engagement early in the process.

So you come to State's site hypothetically, and say you're looking for something on a specific country, topic and so forth, and as you type it in, it pops up, "Here are some records that have been released," which may either satisfy your request or help you maybe narrow or identify what you're looking for. "Oh no, I don't want this, I want that instead." Or even look at, "Here is a list of pending requests related to that. Would you like to be updated as we release records on this topic?"

These are things we're looking into. It's a pilot right now, so we don't have results on this specific effort, but we saw the capability of maybe taking all the data we have on our FOIA.state.gov website, taking those records and maybe re-indexing it, playing with the data, making it more user-friendly, and then by looking at other agency websites that have been viewed favorably through the scorecard here, and other just observations, what could we do to improve the experience with the public? Because at the end of the day, we're working to respond to public requests and all of these records, while they're federal records, ultimately belong not to the government, but the people. So we want to make sure we're only holding that which we need to hold as long as possible and releasing as much as possible on the other side of that. Next slide please.

All right. And so here we have the second FOIA pilot, and this is grouping similar FOIA cases and parts of cases, like one search for many cases. So right now, this is one of the, as a request comes in, here at State, we try to identify similar requests just for processing purposes. But what we're thinking of is wouldn't it be great if something happens, an event happens, we tend to get a lot of requests on that topic or event. So if requests are coming in, if there's a way to tag that request or do something, or maybe look at requests or have something on the back end grouping the requests, could we maybe do one big search to address all of those topics and help get information out to individual requesters or prioritize if this is about that topic, but for one record. So how do we work through very similar requests in the past which have each been done individually, maybe by a team or the same group of people, but not necessarily leveraging technology to work smarter.
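[Editor's illustration: one simple way to group similar incoming requests so that a single search can serve several of them is word-overlap (Jaccard) similarity, sketched below in Python. This is not the Department's method, which is still in the pilot stage and was not described; the 0.5 threshold, the greedy grouping strategy, the function names, and the sample request texts are all assumptions.]

```python
import re


def tokens(text):
    """Lowercase a request and split it into a set of words and numbers."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Overlap between two token sets: shared tokens divided by all tokens."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def group_requests(requests, threshold=0.5):
    """Greedily assign each request to the first group whose first member it resembles."""
    groups = []  # each entry: (tokens of the group's first request, member indices)
    for index, text in enumerate(requests):
        request_tokens = tokens(text)
        for representative, members in groups:
            if jaccard(request_tokens, representative) >= threshold:
                members.append(index)
                break
        else:
            groups.append((request_tokens, [index]))
    return [members for _, members in groups]


# Hypothetical request texts.
incoming = [
    "All cables about the 1997 trade summit negotiations",
    "Cables about the trade summit negotiations in 1997",
    "Visa policy memos from 2015",
]
grouped = group_requests(incoming)
```

[Each resulting group lists the positions of requests that could share one search; a human would still confirm that the grouped requests really seek the same records.]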

So we want to reduce duplication of efforts, the adequacy of speed and search time and look at different ways that we could possibly use technology to say you place your FOIA request on our website. The other thing that happens is someone approves that and then there's a manual search. What if you place your request, we approve it and then it just searches our eRecords archive, you heard us talking about earlier, pulls the records up, identifies these are likely using machine learning, discriminative AI, these are potentially the most responsive records. And then we can get started on the review right away. Again, we're not quite at the point of redaction, but that's kind of the vision. We're not sure we're going to succeed, but we're going to try.

So I think that sums up where we are on these pilots right now. If we can go to the next slide and then I think we open for discussion right after this. So lessons learned to date, just key themes from both the FOIA pilot and from the thinking about the FOIA pilot and the other declass model. Managing data and records is critical to success. I won't read all of these, I'll leave this up here for a moment. Starting small, taking risks, considering bias, being open to results, sharing results, and knowing I think the last bullet, results in one pilot or project may not be applicable to others, but then again, they may. And you want to look at how we can take what we learned from one place and apply it somewhere else and then share that to help other agencies or institutions try to achieve similar results.

And with that, I think if we can ... we'll leave this up and we won't go yet to the next slide, which just has a contact for State, if you want to reach out with FOIA feedback. One of the questions was how do you get FOIA feedback to us? We always have an avenue for feedback available to the public through State's FOIA website at foia.state.gov, and we will share that link in a moment. But here we'll pause and take any questions that came in. I don't know if they came in the chat or the panel has, I think Alina, you said you didn't want them in there, so back over to you to facilitate the discussion. And thanks again for this opportunity.

Alina M. Semo: Thank you. And Gio, thank you for joining us. I know you have to leave at 11, so we've got two minutes, if anyone has any questions for Gio before she has to leave. I'm going to ask committee members if they've got any questions. All right guys, not all at once. Lauren.

Kirsten Mitchell: Lauren Harper and Jason.

Alina M. Semo: Yep.

Lauren Harper: Thank you so much. Really appreciate this presentation. My name is Lauren Harper and I represent the National Security Archive. We're a group of historians that regularly request a lot of State Department cables older and more contemporary, so this has had a big relevance to us. I'll try and be quick because I know Gio has to go soon. My first question is about the FOIA pilot that's currently underway, and I'm wondering if there's a component of that program that learns from FOIA records that have been contested, whether that's something that's been appealed, re-released with a different response, or documents that have been litigated.

Eric Stein: That's a great idea. We're at the beginning of the process, like search and sort, but that's exactly the type of thing we could train a model to do, to look at those decisions where there's been an appeal and looking at how many of those are overturned and maybe that gets the process improvement. So no, we haven't done that yet. It's a great idea we'll take back.

Lauren Harper: And one more quick question, very quickly, is there a plan to incorporate this pilot into FRUS (Foreign Relations of the United States) production, or is that something that's already being done? I know you mentioned there were a bunch of programs that this had applications for. So if you could answer that, that's also a big curiosity for me.

Eric Stein: So while we play a role in the FRUS production, I have to defer to Adam Howard, who's the historian in the historian's office, but we have...Adam was involved in this, the historian, and they're in discussions, they're aware of the technology, so I don't want to speak for them, sorry. But I have to stick to FOIA in my own domain.

Lauren Harper: Understood. And thank you very much for your presentation.

Gio Altamirano Rayo: Well, I have to jump everyone, but thank you so much for inviting me to present and to speak in this important forum. I hope that you all have a wonderful rest of your morning and afternoon. Bye.

Alina M. Semo: Gio, thank you again for joining us. We really appreciate it.

Gio Altamirano Rayo: Thank you.

Alina M. Semo: Eric and David, you guys have to stay. So I think I saw Jason and Luke's hands pop up, but I don't know who was first. So would you guys like to let me know? Luke, were you first?

Jason R. Baron: I'll defer to Luke.

Alina M. Semo: All right, thanks. Luke, go ahead.

Luke Nichter: Thank you Jason, and I'll be brief too. So Luke Nichter, a history professor at Chapman University. And so my interest is as an end user of FOIA, a little bit of overlap with Lauren's constituency. And so my interest is, I wonder if you could talk just for a minute more about what you learned in creating these models. Because I think, probably, so I have a little background creating train models to automate transcription of White House tapes, Kennedy, Johnson, Nixon, and I think there are things that surprise you that work well and there are things that surprise you that don't work well. And even in the case where you might find that different administrations have a slightly different vocabulary, subjects of interest, and when you create the model, of course you want to make sure you're using what you consider an average or a baseline of data, because if you choose a sample that's too unique or too specialized, it might not work elsewhere. So I'm just curious to know, you might have an example of what really worked well or a challenge, something that didn't work well, but thank you very much.

Eric Stein: Sure. In March we briefed the Historical Advisory Committee too. I just wanted to share, I mentioned the historians. So we briefed a group of historians and different academics from all types of different historian, political science, and other fields on this capability, what we were doing. In fact, we went public with it in our annual Chief FOIA Officer Report in March. So this information has kind of been out there about what we've been doing. So I just wanted to share that we have socialized it. To go to your questions, what worked well, what didn't, what were surprises, is I think when we first started, one of the things that happened was we took these, in 1997 ... Well, for the review of the ‘97 cables, we had a whole baseline year also human review and concurrently ran the machines, and the first few results were okay, 50%, 60% accuracy. And we thought, "Well, this could go either way. We'll keep trying."

I think what we learned is just, as you pointed out, having the right terms and thinking about what are we saying is so sensitive 25 years later that can't really be released? And has anyone given a real good hard look to that? And I think coming up with that combined with what's the public interested in, too. I mean, we should be reviewing all of these things and getting them out. And whether it is for the National Security Archive or any historian or anyone interested in records in general, if we can proactively start putting information out, more information out, we can also get more feedback. This is useful, this is not. Can you do more with this? Can you do a little bit less of that? Because all of it comes down to…so what we learned was I think it was the patience of, and kind of sticking with it, because when we started seeing these 50, 60% results, it then got into challenging process and looking at those different areas. So I think that was probably one of the areas.

The other thing, and this what we learned is more of looking ahead to where we are now, with emails and other record types, we're going to have different challenges. Because what do you send to another agency to review in the FOIA context for a referral or consultation? If you train a model to say anything with an email, the to or from, you have a lot of wasted time sending things to different agencies. So I think in terms of success, I'd say we stuck with it.

In terms of a failure, I don't think the model was extremely successful at always identifying what needs to go to some other agencies. And that was because the volume of data is so much smaller. And another interesting thing, some of the anomalous results that happened at first weren't even because of the records, it was because of the data. So we kept getting these quirky results of certain records like, "Well, these look completely fine to declassify and release," and it was actually on the backend and the data needed to be, there was an issue with the way the data was structured. So I guess that's how ... I don't know, David, you were really involved in a lot of this too. I don't know if you have any other examples on what went well or didn't go well for eRecords, but just maybe tap into you for a second here if you think of anything.

David Kirby: No, I think that's good. I will say that the cable records were a great place to start because they are so structured. They all have the same kind of headers, the same format. They've got tags, they've got captions. So that's made it a lot easier. I will say that the data from back in '97, '98 wasn't as good and clean as the data we have now. So we did have some cleanup we had to do with post names, embassies and consulate types, things like that, that has a little bit of a challenge to them.

As Eric mentioned, once we get into emails, we're starting to look at file records now, which are like memos and other correspondence, and that's already introducing some challenges. Once we dip our toe into email, it's going to be a whole different ball game because we've got attachments that can have any kind of data in them. So starting small with cables was the right approach, I think.

Alina M. Semo: Okay, I think we've got about five different committee members queued up for questions. I'm thrilled. And I think I'm calling on the correct order. Jason is next, followed by Adam. Adam, did I see your hand up or did you put it back down? Okay, it's up. And then I've got Gorka, Stefanie and Patricia. So hang in there guys. Jason, you're up next. Thank you.

Jason R. Baron: Thanks, Alina. Eric and David, this is tremendous. You're doing cutting edge work for the federal government. A hundred percent supportive. I have three questions. I'll try to be brief in setting these up.

The first is to what extent you have been engaged in any kind of partnership or working with the eDiscovery community? As you know, Eric, and we've talked before, I've been on a soapbox since 2006 when I helped create the NIST, the National Institute of Standards and Technology, Text Retrieval Conference Legal Track, where machine learning was compared with keywords and manual searching.

So on the order of a couple of decades, lawyers have been working towards using machine learning methods and there are very up-to-the-minute techniques, including continuous active learning that don't involve the massive training that you went through in classification to really make a difference in terms of responsiveness in search. Not filtering, but search. And so my first question is to what extent are you aware of and working with the eDiscovery community, the legal services industry, in connection with your efforts?

Eric Stein: All right. I think, Jason, good question here. I'd go through a couple things. When we created eRecords, it was part of the OMB (Office of Management and Budget) and National Archives, NARA, mandate to meet 2016-2019. And building up to that, we did a lot of research, market research about tools that are out there in eDiscovery. I know I'm going back in time here, but just bear with me for a second. We saw what was out there and there were some amazing tools back then that were out there. Also, they just get very expensive. It comes down to what can you afford? What were the budgets and could they come onto department networks, infrastructure, FedRAMP (Federal Risk and Authorization Management Program) approved. You get into all types of IT challenges.

So are we partnering with anyone in the legal services industry? No. But I do know we have attorneys we've worked with who've come from the private sector and industries who've talked about different capabilities that they have in those different law firms. We have looked at different tools that are available for eDiscovery. Since then, we have looked at different technology and a lot of it comes down to, and it's on one of the slides, AI plus AI plus AI doesn't equal super AI. We have a situation where we have certain machine learning capabilities and AI and eRecords, which is terrific, but then that system's interoperable with others. So we start talking about e-discovery tools for us here at State. We looked at if we start layering these tools on top of each other, could they affect one another? Do we actually get a worse result in the end? Do we actually have problems of moving something from one system to another and so forth? So not much partnership. I would be interested if there are any type of specific standards or things that you thought would be worth looking at. Of course, please submit them so we could look and see what's out there. I know you have a lot of depth and experience in that area.

Jason R. Baron: Thanks Eric. I would suggest an RFI (request for information) from State to reach out to what is sort of state of the art out there, but we can have an offline conversation. The second point I want to make is that you may be aware of, Committee members are aware that I've been engaged in research both at the University of Maryland, and I didn't identify myself as a professor there, but also in partnership with the Mitre Corporation, especially on B5, on the deliberative process privilege and the research that we have done, based on the Clinton administration and presidential record collection, is that machine learning methods are about 70% accurate with respect to sorting ranking documents that have portions that are either within or outside of B5. And so I wanted to make you aware of that and we can have a further conversation, but there is current research that might help with the question of sensitivities and redactions. That's my second point.

The third is a contrarian question that I have about the classification effort. I think the 300,000 training is tremendous. It's a ground truth on 300,000. It's a larger data set than anywhere else that I've ever seen in the information retrieval community. But let me just ask a devil's advocate question. If you are 98% accurate in sorting documents or at least finding documents that are either classified or unclassified, and you have a million documents, that means 2% are inaccurate and that means 20,000 errors. It wasn't clear to me, in the ranking scheme when you did the three buckets, how many of those errors are in the tier where you would presumptively review subject to a sampling of human quality, human reviewer input. So what do you say when a deputy secretary, you've released something within that 2% and a subject to the sampling and somebody comes up and that becomes the Washington Post story of the day because some classified material has been released, missed both through the automated method and through the human review filter. Are you anticipating that that is a potential bad day in your future?
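
[Editor's note: The arithmetic in the question above can be checked with a short sketch. The function name `expected_errors` is an illustrative assumption and not part of any State Department tool. The inputs are the hypothetical figures given in the question: one million documents sorted at 98% accuracy.]

```python
def expected_errors(total_documents: int, accuracy: float) -> int:
    """Return the expected number of misclassified documents.

    Assumes every document is equally likely to be misclassified,
    which is a simplification made only for this illustration.
    """
    return round(total_documents * (1 - accuracy))


# Hypothetical figures from the question: 1,000,000 documents at 98% accuracy.
errors = expected_errors(1_000_000, 0.98)
print(errors)  # 20000
```

[The sketch gives only the total. It says nothing about how those errors would be distributed across the three review buckets, which is the point the question raises.]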

Eric Stein: It's a potential bad any day I could have that situation, I think occur. Mistakes happen and I'm not trying to downplay. Yes, you're right. I guess I would go back with another contrarian question. Should we just not release anything then just keep the status quo? So it's a real problem about what the risk appetite is and what I tend to find is we never want to release anything classified or sensitive, but as we see events unfold and something we've released 40 years ago, could all of a sudden be sensitive again today? I guess I'd go back to the slides. I think they'll be shared or posted, but just to be very clear about the three buckets, if it said automatically declassify, or proposed for declassified, that's one bucket. The one that said "human reviews required," those all go through human reviews and humans make mistakes too.

And then the ones that said keep classified, all of those would get reviewed. They were so small, it was 800, 1400, and that's where we actually learned some of the lessons that some of those were actually data issues or so forth. I don't have a great answer for you, Jason, actually, on that one, that's a tough question in terms of what do we do? It's just we put controls in place and before we post the records online, there will also be a final scrub for, while we check for, PII and privacy information as part of this review at the onset. There's some other statutory and other information that could be included so we're learning about other sensitivities. We never want to release anything we shouldn't, but we also know there's an obligation to be transparent with the public with these records. So I want to pause. I really would be interested in any follow up questions or thoughts on that.

Jason R. Baron: Well thanks very much, Eric. And we can talk offline.

Alina M. Semo: Jason, thank you. Adam, I believe, feels as though his question has already been covered. I'm going to go over to Gorka Garcia-Malene from NIH. Gorka.

Gorka Garcia-Malene: Thanks, Alina. Well, first and foremost, Eric, Gio, and David, great presentation, incredible work. Like Jason said, you really are at the cutting edge of FOIA and technology. I have two questions. The first one, and I understand both humans and machines make mistakes. That's just how it goes, right? So from the error rates in your deck, which is, by the way, on NARA’s website already, thank you. It looks like the model leans toward protecting information, right? And I guess what I'm wondering is, how do you think about what machine learning error rates you're comfortable with?

Eric Stein: Yeah. I think that's very similar to some of the points Jason was kind of touching on with that 2%. If there's 2%, it could be a large volume. Let's just talk about what we did before this project. We use Boolean logic, so it was and/or searches. And so we have a universe of say this many records. We would come up with what we call dirty words or key terms that we go through, some of which are classified themselves to make sure that we flag these records and documents. And sometimes you'll find that certain acronyms are parts of a word and there'll be false hits. So there was a lot of trial and error back in how we would do it before as well. And what we learned through this approach, the new approach is that we'd actually get a better understanding of the connections made among certain records sets and I'll give you an example.

If someone calls me Eric, Eric Stein, Mr. Stein, DAS Stein, DAS Eric Stein, the DAS, or different title specific, the new model will pick up on that, the old one would not. It would just use, literally, whatever we put into the search results. So the review and what we're finding is more accurate in that regard. There will be blind spots likely in any type of review that occurs of a large volume of records like this. There could be a market for a secondary machine learning model, does a QC (quality control) maybe after this. Maybe that's something we could look into down the road, a secondary initial review and things to look for. So there's a lot of potential down the road and it comes down to in all of this, our staffing levels have stayed the same, and as a result of this, we were able to take on some additional work and do some different things that were more stagnant or maybe not moving forward because we had to commit so many resources to this.
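
[Editor's note: The false hits and missed name variants described above can be illustrated with a minimal sketch. It does not reproduce the Department's earlier search process or its machine learning model. The function names and sample sentences are assumptions made for illustration only.]

```python
import re


def substring_hit(term: str, text: str) -> bool:
    """Case-insensitive literal match; also fires when the term sits inside a longer word."""
    return term.lower() in text.lower()


def whole_word_hit(term: str, text: str) -> bool:
    """Case-sensitive match that requires word boundaries on both sides of the term."""
    return re.search(rf"\b{re.escape(term)}\b", text) is not None


# Invented sample sentences, not real records.
title_doc = "The DAS approved the cable."
other_doc = "The dashboard was updated."
variant_doc = "Mr. Stein signed the memo."

# An acronym that is part of a longer word produces a false hit.
print(substring_hit("DAS", other_doc))    # True
print(whole_word_hit("DAS", other_doc))   # False
print(whole_word_hit("DAS", title_doc))   # True

# A literal search for one form of a name misses the other forms.
print(substring_hit("Eric Stein", variant_doc))  # False
```

[Word-boundary matching removes the acronym false hit, but a literal search still misses name variants unless each one is listed by hand. That is the gap the speaker says the newer model closes.]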

So I think this gets into the dialogue with the public and others, the historical community and so forth. What are you interested in seeing, and just in general, the requester community moving to more the FOIA or broader transparency, what are you interested in seeing how we do it? And I think I want to turn to David for a second here because what we developed for the machine learning declass model, we're folding that into the tool we use now. It's actually, it's our process, it sorts, and then we were able to use that result to help inform what we review now, correct? It worked within our infrastructure and our ecosystem is what I'm trying to say.

David Kirby: Yeah. So we're actually taking the model, running it through the cables before the reviewers even get their first pass at it, and kind of pre-bucketizing it for them. But the reviewers have full control to view everything. So they want to go and do a deeper dive into the exempt category or the declass, they can do that. I do want to mention one thing because you talked about error rates and one thing I thought was interesting early on in the process is when we ran through those '97 cables that were already a hundred percent human reviewed, we sat down with the reviewers that said, "Hey, here's the ones that had a conflict between what our model said and what the reviewers said." And we found in many cases the reviewers actually agreed with the model and not the previous manual review. So you can have instances where one reviewer looks at a document and another one and they get completely different results. So even though we may have a 2% error rate on the model, that may be better than the error rate we have with human review anyway so keep that in mind as well.

Eric Stein: Yeah, David, excellent points. I think going back to Jason's point before too, there has been a day here before where I've looked at a case in a FOIA request where someone said we had to deny an entire record because it's classified and as Kirsten put in the chat, B1 is the FOIA exemption for, so deny the whole record B1, but then a different reviewer said, "You know what, you could release that." And because the record came up again in the case and then under the executive order for classification, when in doubt, we checked with the experts and ultimately we could release that information, which was terrific. So in terms of this also questions, this approach helps us to maybe question just a little bit that presumption and concern I think a lot of people have about over classification and so forth. If information is either declassified already or can be declassified, maybe that's an area where this could apply to the FOIA community as well, or sensitive information, sensitive until this moment of time maybe then it's not so sensitive anymore until a specific event or so forth. So there are a lot of ways this could be applied. I'm actually excited to see what others come up with in this area as well.

Gorka Garcia-Malene: Thank you. And you've addressed my second question, which was have you gone back to make sure that it was the machine that made the mistake and not the reviewer? And I mean as is, your error rate is impressive. I mean out of 48,000 cables that were recommended for declassification, 350 were into the mistakes so that's just really impressive. Thank you both.
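
[Editor's note: Taking the figures as stated above (350 mistakes among 48,000 cables recommended for declassification), the implied error rate can be worked out as follows. The function name `error_rate` is an illustrative assumption.]

```python
def error_rate(mistakes: int, total: int) -> float:
    """Return the fraction of items that were mistakes."""
    return mistakes / total


# Figures as quoted in the remarks above.
rate = error_rate(350, 48_000)
print(f"{rate:.2%}")  # 0.73%
```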

Eric Stein: Thank you.

Alina M. Semo: Thanks, Gorka. Let's see, Stefanie I believe is next.

Stefanie Jewett: Thank you. Stefanie Jewett. I'm from HHS OIG (Health and Human Services Office of Inspector General). It's some sort of a question that's not off-topic but a little bit different. I'm definitely a huge supporter of this AI development to deal with the ongoing FOIA search and the lack of resources, that's always an issue. But I'm curious what you would say to those agencies who would not be able to create a homegrown system like this and would have to rely on the private sector for the software and, therefore, the private sector would be the ones training the AI and, therefore, essentially making the initial decisions and removing the government from those initial decisions. So just curious about what you would say to the critics that would say, "well, you're removing this entire process from the government now because there's several agencies that would not be able to do this."

Eric Stein: No, that's an excellent question. I know exactly what I would say to anyone who asked that. Come talk to us in the Technology Committee under the Chief FOIA Officer Council. And we have a bunch of experts who have been through similar situations. We may find that what we're going to tell you at State may not be as relevant to what your agency's doing. But a couple other thoughts. We've had agencies come to us and share these are the challenges we face. Maybe there's not budget buy-in, we don't even know where to start. How do we gather requirements? So we would be able to advise on here are requirements you might want to put into a contract that you're looking to have with a private company. The requirements of maybe custody of data, how it's used, how it's trained, what input you have. I mean some solutions, COTS (commercial off the shelf) products, and others are just going to have, this is how you have to use the tool and there's not going to be much wiggle room.

Others may have more flexibility. Sometimes with that flexibility though, you start breaking other things. So I would say come talk to the Technology Committee. We have a group that works on AI and search and they would be the ones probably to start. If that wasn't the group, we'd go to the broader 40, 50 plus members and try to find someone to help that individual.

Stefanie Jewett: Thank you.

Eric Stein: You're welcome.

Alina M. Semo: Patricia.

Patricia Weth: Hi, good morning. Patricia Weth from EPA. I wanted to thank you for this presentation. As I heard you speaking about AI, I thought about my first days in FOIA, fresh out of law school, making redactions with a magic marker. Those of you who don't know what a magic marker is, it's a sharpie, and then we would photocopy it. And for my friends who were at agencies who had more resources, they were using an X-Acto knife to redact and then the photocopying. So the thought of down the road using AI actions and FOIA records is really exciting. And we, in the federal government, we have limited resources and we're all trying to work smarter or not harder. I'm just wondering if you could kindly talk a little bit about your program. You spoke about it at the beginning of your presentation of how federal agencies could participate in your program or benefit from it, and it was a pretty quick discussion. I was just wondering if you could talk to us a little bit more about that?

Eric Stein: Sure. I just want to make sure I have the question correctly. Is it the AI course I mentioned that I took earlier?

Patricia Weth: I am not sure. It looked like at the beginning it went pretty quick, but at the beginning of your presentation it looked like there was...perhaps that's it. Perhaps it was the AI course.

Eric Stein: So the Partnership for Public Service has an AI course for GS-15s and members of the Senior Executive Service. And here at State, the Senior Foreign Service, just senior level leaders that socializes artificial intelligence and executive level to think about policy considerations like bias, ethics, how do you develop a program, who are the partners to talk to and so forth. There are many other great ones out there. That's just the one I've personally taken. It also had a nice price point, it was free. So I know several people have taken it and are taking it right now. They found it rewarding. There are a lot of resources out there now, so I think if you ever wanted to talk about it, please reach out to the Technology Committee. We have members who've taken it there. We can put you in touch with others, but if anyone else is aware of other great resources, they don't just want to support one, I know there are many out there.

I think the most important thing is just to raise awareness. Prior to that course, I took advantage of our Bunche Library. It's our oldest federal library. So here at State we're very proud of that. We would do research on AI in different journals, articles, and so forth, just to become familiar with the concepts that are out there. So it's as simple as a Google search sometimes, but if you're looking for additional training, I mean, I could say the program was terrific from my experience.

Patricia Weth: Great, thank you so much.

Alina M. Semo: Okay. I think I saw Paul Chalmers next, from PBGC (Pension Benefit Guaranty Corporation). And then Jason, you raised your hand again or is that an old hand?

Jason R. Baron: Yes, I raised it again.

Alina M. Semo: Okay, so Paul's next. Go ahead, Paul. Oh, you're on mute.

Paul Chalmers: Sorry. Hi Eric, it's Paul Chalmers from the Pension Benefit Guaranty Corporation. I was wondering what kind of objections you ran into from your, you've referenced it a little bit, but objections you ran into from your enterprise architecture and cybersecurity control people. I know you run privacy over there so, hopefully, that wasn't an issue, but what objections did you run into and how did you overcome them?

Eric Stein: Sure. So I think we socialized it well ahead of time to understand what some of those concerns would be. And by doing so, we really didn't hit any of those speed bumps. I think we just launched into this saying we're going to do something. But since there's a benefit to the lag time between the course ending in May of '22, which is when I wanted to start right away, all motivated after the course, "Let's go do this [in] June." And with other competing priorities, we didn't get to start till October. That gave us a few months of lead time to talk June, July, August, September, about four months to socialize with key partners. What are things we should look for? What are the concerns? Also consulting with interagency partners, we're looking to do this, what do you think? And I was actually surprised there was a lot of support.

I guess this goes back to the first question, biggest lessons learned. There was a lot of support because I think people were interested in trying something different and it was just a new approach and maybe if it worked, it could be something that...so we didn't have those concerns raised. It's funny now that we're at the point about to release information, some of them are coming out, are we sure there's no statutory information there, privacy and so forth. And, of course, privacy is something we take very seriously and something you mentioned we're responsible for it here at State. We are making sure we put an additional check in place before we release anything just to make sure. Because anything else, a machine just like a human, can make a mistake and we want to do our best to hedge against those issues.

Paul Chalmers: Thank you very much.

Eric Stein: Actually, one more point on that. I'm not sure every agency or every group would have responded that exact same way. There are different concerns, sensitivities and so forth. So we maybe got lucky here too in terms of having this perspective, but there are very legitimate concerns that could have held up progress, and rightfully so, if certain circumstances occurred. So I think we were fortunate. And that's not to say as we review this next year for '99 or 2000 or so forth, we may now have to rethink, if not the whole thing, parts of this as well.

Alina M. Semo: Okay, thanks Eric. You get to take a breath. Jason, I'm going to call on you and then Gorka has another quick follow up question.

Jason R. Baron: It is Jason Baron at University of Maryland. I wanted to just respond to Stefanie's observations. I hope Stefanie and I hope everybody in the government community knocks on Eric's door. Eric, I'd hope that 300 components of government knock on your door, so be careful what you wish for and no good deed goes unpunished in coming here today. So that's the first thing. But the second thing is, Stefanie, that I would strongly recommend, and I assume it will be the practice of every federal agency to the extent they use AI tools through the e-discovery sector, commercial tools, if that's the way to go, that it won't be the vendors who will be doing the training. It would be in-house with your own people who are FOIA experts.

They can give you the software and the licensing and all of that, but you'll use your own people. You might use your contractors, and that's a separate question about controlling that and making sure that the training goes right, but it'd behoove every federal agency to use their own people. I also, again, would recommend as I did to Eric, I really think that RFIs are the way to go for each agency. There have been efforts to do RFIs in the past, but this is such an important area that I think every agency should be considering reaching out to the broader private sector to see what is possible. Thanks.

Eric Stein: Yeah, great points. As for all the calls and people knocking on doors, I'm always surprised how little feedback we get afterward. I may regret saying that now, but I think we tend to find that a lot of people don't follow up, which is disappointing. I mean, in the Technology Committee we do get feedback, we vet it, we share it, and we go through it at a minimum and it does inform our decision making. It could be, this is a great idea [or] that we don't think this is so great. Could you think about this instead? So in this specific context of this briefing today, we would welcome the thoughts and feedback and take it back to a team I meet every two weeks with David, and Gio, and our team that does all the work of this as well. So yeah, I think the other thing on RFIs, which are requests for information from public agencies to see what's out there.

We're interested in this type of tool or technology. Yes, those are great avenues to pursue. And I think one of the things we've seen in the Technology Committee is that some agencies just want help to even starting that process. Where do I go? What do you put into an RFI? We're thinking about this and how specific should we be or not? And one of the other things that come out of this pilot but also the FOIA pilot is we need better shared platforms between agencies. We're still using email way too much. I'm not talking about correspondence, I'm talking about to process FOIA requests and it's wildly inefficient. It takes too long. If we had better technology to help collaborate, in particular with referrals and consultations, or even internally at times, I think the FOIA process could be improved in many different places.

Alina M. Semo: I see Catrina's hand up too. Let me get to Gorka first for a quick follow up.

Gorka Garcia-Malene: Ok. Thank you, Alina. I hope I'm not asking you all to repeat yourselves. I just want to confirm, I see that you, Eric, David, and Gio took what you learned from the pilot and tried to incorporate it into FOIA, both on the customer experience side and also as it relates to document searching. Is it fair to say then that the pilot did not immediately reveal opportunities for machine learning in the realm of FOIA document review? Did I hear that correctly?

Eric Stein: It depends on how you define review, I think. Because in terms of applying redactions, no, we're not there. In terms of reviewing large volumes of information to help maybe narrow what's potentially responsive or not. Let's just talk about that process for a moment; eRecords is like a Google-like search across our unclassified and classified networks, emails, cables and so forth. So we have a large volume we can put in the terms from a requester and get two million hits and we can try to narrow it down the best we can, but we may have to go through manually each one of those records to figure out this could potentially be responsive and if it's responsive, then we do the review. So I wouldn't say it was not successful in FOIA review because I think the search is a big time component and it's grown in terms of the time in a FOIA case, and that's grown. If that's grown, that also means the review time has to grow as well for the actual redaction and application of any exemptions and so forth. So one could argue that this actually helps the review process in that area. And I think Jason mentioned Mitre before, we looked at their tool. They came to brief us. They had some great, great possibilities in what they were doing at Mitre. The question becomes how does that work with a record set, or archive, or case processing tool and so forth. And it's going to vary by agency and it gets into the issues we talked about before. So I think there were major wins, potentially, for the FOIA and transparency community sorting, because the core issue at the start of this is how do we deal with this growing volume of requests and the growing volume of information data and records.

David Kirby: I would just add to that.

Eric Stein: Yeah, I was going to say, I'm sure you have a view on that, David.

David Kirby: Yeah, I would add that not just this effort, but other efforts we've done with the analytic team on other projects as well where we've relied on eRecords data, we've learned a lot from that, but it's actually had us incorporate additional metadata into our archive that wasn't there before. So things like we're now doing entity extraction where we're actually identifying key people, places, organizations, and things like that. And adding that to our metadata, which is helping not just analytic teams from their projects, but also our regular searches for FOIA because they can now filter and facet on those entities. We've now added a sentiment score to every record in the archives that we can tell the tone of the record is positive or negative, which can help with certain searches and discovery.

Eric mentioned when you do a search on anything that's in the news, you can get millions of hits because everybody's got a subscription for the Washington Post, and Google, and things like that coming to their email and you're going to get hits on all that. So now we automatically tag the top senders that exclusively send subscription type news emails, and so users and searchers can just filter those out with a single click. So things like that are the kind of lessons learned from these AI projects that have helped a lot in just the general kind of metadata tagging and searching of records.
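
[Editor's note: The sender tagging described above can be sketched in a few lines. This is not the eRecords implementation. The record fields (`sender`, `is_subscription`), the function names, and the sample addresses are all assumptions made for illustration, and the sketch leaves out any ranking of senders by volume.]

```python
from collections import defaultdict


def exclusive_subscription_senders(emails):
    """Return the senders whose emails are all flagged as subscription news."""
    flags_by_sender = defaultdict(list)
    for email in emails:
        flags_by_sender[email["sender"]].append(email["is_subscription"])
    return {sender for sender, flags in flags_by_sender.items() if all(flags)}


def filter_out(emails, tagged_senders):
    """Drop every email whose sender is in the tagged set (the 'single click' filter)."""
    return [email for email in emails if email["sender"] not in tagged_senders]


# Invented sample records, not real data.
emails = [
    {"sender": "news@example.com", "is_subscription": True},
    {"sender": "news@example.com", "is_subscription": True},
    {"sender": "colleague@example.gov", "is_subscription": False},
    {"sender": "mixed@example.org", "is_subscription": True},
    {"sender": "mixed@example.org", "is_subscription": False},
]

tagged = exclusive_subscription_senders(emails)
remaining = filter_out(emails, tagged)
print(tagged)          # {'news@example.com'}
print(len(remaining))  # 3
```

[A sender who mixes subscription news with other mail is left untagged here, so that sender's records stay in the search results.]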

Alina M. Semo: Great. I think we're getting ready to move into our break soon, but Catrina, I know you've very patiently had your hand up, so please go ahead.

Catrina Pavlik-Keenan: I just wanted to say, Eric, I already reached out to you and sent you an email about us getting together because I'm very interested in AI stuff and things like that right now. We know that that's the way things are going to have to advance in order for us to manage all the work that we're getting, doing all the FOIAs. And I wanted to say, anybody who hasn't, so I am very familiar with the class that Eric's talking about at the public partnerships. I actually am signed up for the next class that starts October 5 so they are accepting applications right now for that.

And in our case, and I don't know, Eric, if this is how you all got started doing this pilot program, but I believe that you started a project out in that class and that's part of what you do and I wanted to ask you, was this part of the project that you started in class? Because I know, for instance, for us, James Holzer is actually doing part of the AI in his class right now, which ends a week before I start, and so I'm going to carry over the project that he's starting for my project and see how much further we can take that in this program so that was the one question that I had.

Eric Stein: Yes, the answer is yes. One of the previous slides in the deck that's publicly available has my actual charter from the project. Like I said, that's why I joked about it's humbling to go back to your work, but it was also very exciting to be able to have a vision and to see it come through. And there was risk. We thought this might not work. It worked, at least in this instance that we have a couple of years where it's been successful, may it continue to be. But I think overall it's also sparked some energy and excitement around the ways we could use technology for records access to think about proactive disclosures in FOIA to help other agencies to collaborate and so forth. So I'll be on the lookout for your email and I'm excited for you for that course. I really enjoyed it.

Catrina Pavlik-Keenan: And thank you for doing the presentation guys. It was great.

Alina M. Semo: Yep. All right. I don't see any other hands up. Going once, going twice. Just looking at everyone to make sure I didn't miss anyone. Okay. So Eric and David, thank you again for your time for being so patient in answering all the questions. I promised you there would not be crickets from the committee members and I delivered. And if any committee members have any other questions and you want to talk offline, I'm sure Eric and David can make themselves available. And with that, let's go ahead and take a, let's try a 10-minute break since we're running a little behind schedule. If we could get back here by 11 minutes, if we can get back here by 11:50, that would be great, and we'll resume with our subcommittee reports. Thank you so much.

[Break]

Michelle [producer]: Hi, welcome back ladies and gentlemen. We will now commence our meeting today and I will turn it back over to Ms. Alina Semo, Director of Office of Government Information Services and Chair for FOIA Advisory Committee. Alina, please go ahead.

Alina M. Semo: Thank you, Michelle. Welcome back everyone. I'm just checking my screen to make sure that we have a quorum. I see a couple of people are still missing. Hopefully, they will join us momentarily. Still waiting, I believe for Bobby, Carmen, Jason Baron, Tom Susman, Allyson Deitrick. Hopefully, they're going to join us in a second, but thank you to the rest of you who've returned. I see Bobby now. Thank you. So, any other questions or comments about the presentation we just had from the State Department? It was really terrific. Maybe you just need a chance to absorb more of what you heard. Anyone have any comments or thoughts? Did you find it helpful or helpful and useful in any way to the work you're doing in your subcommittees? I see Lauren nodding. That's good. Thumbs up. Thank you. Okay, so without further ado, let's move on to the next part of our meeting. We're going to get subcommittee report outs.

I know the subcommittee members have been working very hard in each of their subcommittees, so I am very excited to have them present on the great work they've been doing. We're going to start this time around with the Resources Subcommittee. We'd like to shake things up every time and give each subcommittee a chance to start off first. So without further ado, I'm going to turn things over to Gbemende Johnson and Paul Chalmers.

Gbemende Johnson: Thank you so much, Alina, and hopefully everyone can hear me okay. So, as we mentioned at the last FOIA meeting, the Resources Subcommittee was conducting interviews of high level FOIA officials and a survey of federal FOIA professionals that was initially launched at the ASAP (American Society of Access Professionals) Conference in June. So we've completed both of those tasks. We will begin the process of aggregating and condensing responses from the interviews very shortly. But regarding the survey, we received approximately 150 complete responses. And if you recall, we were asking FOIA professionals about issues such as training, resources, and technology. And I don't have time to go over all of the responses here, but a few that stood out. Seventy-seven percent of responses noted that they felt they needed more resources to properly implement FOIA. When asked what they believed was the greater need in their office, 53% stated the need for more staff, 21% stated the need for more technology, and 16% the need for more training and the remaining responses were in the other category.

We were also interested in retention issues. So we asked if respondents had considered leaving their positions and 54% of respondents stated that they had for various reasons. For example, some people were looking to retire, but the top two reasons given, of those who said yes, were higher grade opportunities and a concern over a lack of needed resources to complete their tasks. So in regards to some of these responses, the Resources Subcommittee is exploring a number of recommendations that could, hopefully, provide practical solutions that could aid agencies in bringing on additional FOIA staff resources when needed. And I'm just going to touch on three points and let Paul go into more detail. So one recommendation that we're exploring is recommending that the GSA (General Services Administration) add FOIA contractor services to the GSA's Schedule to help agencies save time and money when hiring contractors, if an agency decides that they need to hire contractors. And we want to stress the if. If an agency wants to hire contractors, is there a way to speed up the process in a way that doesn't compromise the process?

Another recommendation involves modifying the career ladder for government information specialists. Also, something else that we're exploring is allowing the direct hiring of FOIA specialists through the excepted service rather than requiring full competitive hiring. I am going to let Paul go into more detail about these points.

Paul Chalmers: Thanks, Gbemende. I'm Paul Chalmers from the PBGC. I'm going to talk about the human resources ones first. So one of the main themes that came out of the interviews that we conducted, as well as the survey, was the frustration with hiring and retaining quality FOIA people. Once you hit a certain level in the government, if you're on what's called a career ladder job, there's no place else for you to go.

And with the FOIA, people in FOIA jobs tend to have a lower cap to the career ladder than other professions in the federal government. That leads to people deciding they've done enough in the federal government, or jumping to an agency that might have a one-off position at a higher grade or looking for other opportunities. So if you make the ladder a little higher, then you tend to promote retention. People stick around a little longer to take advantage of that higher grade. There's another group that was recommended by a prior term of this committee, the COCACI (Committee on Cross-Agency Collaboration and Innovation), who are also looking into this issue. And so we're coordinating quite closely with them with respect to this issue.

Just to give you an example, my agency has, the career ladder caps out at a 13. Well, we have a 14 at my agency that's not on the ladder. That we could look at along with fourteens that might exist at other agencies and put together some kind of a recommendation that says, "Let's make this part of the ladder so we can hang on to the good FOIA people."

The one on direct hiring, that's another source of frustration. It can take ages to fill positions when they're open, unless you can do the expedited or exempt hiring. The federal government has recently extended exempt hiring into areas such as IT specialists and cybersecurity because it's an important function. They want to make sure they're able to fill the ranks with qualified people quickly. Well, this is just as much of an essential function. We need to have that flexibility in order to fill our positions and make sure that we fulfill our obligations to the public.

The first one that Gbemende mentioned, and I'm going to tag in Stefanie Jewett if she wants to speak, is what's called the GSA Schedule. The GSA Schedule is literally a schedule of different goods and services and vendors that the General Services Administration has pre-qualified. That agencies can simply come in and write a task order without doing a full procurement. Get contractors on board or goods or services, whatever they need, in a much more rapid fashion than going through a full procurement. If you are in a bind and you need to bring in contractors to help with some issue that you're having in your FOIA world, it would really speed up the process if you were simply able to write a task order against the GSA Schedule. Rather than having to draft a procurement package, put it on the street, do an evaluation, and potentially deal with a protest.

Stefanie, are you on? Did you want to add something to that?

Stefanie Jewett: Sure. Thank you, Paul. I just wanted to quickly say for this one, this would not be to replace full-time employees. I think we all can agree that full-time employees would be the preference. However, there are certain circumstances that government agencies have where they may have a limited small budget that they quickly have available that we could advocate to get temporary help.

Just some few situations that the group has talked about where this could be helpful. A small agency who has maybe only one or two employees and they suddenly get hit with a run of requests and they would only need someone for a very limited time. Agencies who cannot commit to an ongoing salary that potentially would be able to commit to a small amount. Another example is often that happens in the government, right? Like if a project falls through, if a system that they were going to acquire fell through, another situation where we potentially could deviate those resources and quickly get a contractor on board.

Like Paul was saying, this could save months in terms of agency resources and trying to get a contractor on. This would just be another option that would be available for government agencies to use. Again, I think that the important thing is it would not be to replace any type of full-time employees. But there's so many opportunities out there throughout the year where government money and resources become available. So this would be a great thing that they could look at and just use that money to quickly get somebody on to help with a big... Bring down the backlog really quickly for a limited time.

Thanks, Paul.

Paul Chalmers: Thank you. So these are three examples of the types of things we're looking at writing up over the next couple of months that would address the practical frustrations that federal FOIA offices are confronting on a day-to-day basis in staffing and running their operations. We'll be looking at these and others. Hopefully we'll be able to bring some degree of assistance to the managers of these departments.

Gbemende, I'm going to turn it back to you unless there's questions for me specifically.

Gbemende Johnson: No, I think I'm good. Are there any questions? Thank you, Paul and Stefanie.

Alina M. Semo: I also ask any of those Resources Subcommittee members, anyone else want to chime in with any other thoughts? Okay. Guys, you're doing great work. Thank you so much. Really appreciate the report out.

Next, I'm going to turn to Implementation Subcommittee co-chairs Dave Cuillier and Catrina Pavlik-Keenan. Over to the two of you. I don't know who's speaking first.

Dave Cuillier: Well, I can make this quick I think. I'm Dave Cuillier. I'm director of the Brechner Freedom of Information Project at the University of Florida. The Implementation Subcommittee's been making progress, still working on examining how those 51 recommendations passed by the four previous terms have panned out.

We have a working group that's gleaning through Chief FOIA Officer Reports to assess progress on nine of those recommendations. Next month, we'll send out a survey to Chief FOIA Officers and interview some of them to also gauge progress on another dozen recommendations to see where things are.

We've started crafting our draft report. We hope by the December meeting we should have some preliminary conclusions to report. So hopefully we'll come back and give folks a sense of what we've seen so far. And then of course, thanks to everyone on the subcommittee for all their work and time and expertise. If anybody else would like to chime in or add anything or ask questions, feel free to do so.

Catrina, anything I missed there?

Catrina Pavlik-Keenan: Nope. Once we start getting the stuff together that we're going to do, Dave's going to hand over the part for me to do the interviews for the day's portion. So you'll be getting calls from me, some of you that will be interviewed. So I'll be taking over that part so you'll get to talk to me about everything that we want to know.

Alina M. Semo: Sounds great. Thank you to both of you, and thanks to the rest of the subcommittee members. Any other committee members have questions for Implementation? Going once, going twice. Looks like everyone just wants to go home early, and I respect that.

Okay, last but certainly not least, Modernization Subcommittee co-chairs Jason Baron and Gorka Garcia-Malene. Jason and Gorka, I'm going to turn it over to you.

Gorka Garcia-Malene: Thank you, Alina. I think I'm going to go first. Good afternoon, everyone. As Alina alluded to, my name is Gorka Garcia-Malene. I am the FOIA officer at the National Institutes of Health. Together with Jason Baron, I co-chair this Advisory Committee's Modernization Subcommittee. Our subcommittee continues to meet every two weeks with working groups convening in between.

Since the June meeting, the subcommittee successfully collaborated with NARA and with DOJ (Department of Justice) to produce a memorandum that has been circulated to all Chief FOIA Officers. That was back on August 21st of this year, obviously. The purpose of the memorandum was threefold. The first is to remind Chief FOIA Officers of the August 2023 deadline for interoperability with foia.gov. The second is to remind Chief FOIA Officers that FOIAonline itself is being decommissioned at the end of the fiscal year. The third is to share some best practices as it relates to customer service.

And before moving on, I just want to thank both Jason Baron and Alex Howard, who are fellow Advisory Committee members, for delivering the lion's share of our contribution to this important memorandum. So thank you Alex. Thank you, Jason.

On a separate front, we continue to work on developing a model determination letter for our collective consideration and comments. Adam Marshall, our fellow Advisory Committee member from Reporters Committee for Freedom of the Press, is spearheading that effort. I'd like to share the floor with Adam for his thoughts on the progress of his work. Adam, you have the floor.

Adam Marshall: Thanks, Gorka. At the last Advisory Committee meeting, we had noted that this was a project that we were working on but wanted to solicit input from the broader FOIA community, from members of the public, from federal agencies. And so we embarked on a process to solicit that input.

I'm very glad that we did. I'll say quite candidly that we got more comments and more engagement than I thought that we were going to receive. We received comments from members of the public, from civil society organizations, and from federal agencies. We were very excited about that, and we've been digesting, I would say, those comments. Some of them were broader and more general, and then some of them were very specific and quite technical in nature. So we have been reviewing and discussing those in our biweekly meetings, as Gorka said.

I am quite confident that they've already made the draft that we've been working on, a better work product. We are continuing to work on them with the idea that we will have something for the whole committee to look at in the future. And so thanks to everyone who submitted comments, and to the subcommittee members for all of the engagement on that project.

Gorka Garcia-Malene: Thank you, Adam. Jason Baron, our co-chair of the subcommittee, is also here. Jason, would you like to share your thoughts on the progress of our efforts?

Alina M. Semo: Oh. Jason, you're on mute.

Jason R. Baron: Can you hear me now?

Alina M. Semo: Yes.

Jason R. Baron: Sorry. So I want to echo what Gorka said, that [I] appreciate Alex Howard's efforts in spurring on the idea that we should have some engagement with the wider federal community on foia.gov and the sunsetting of FOIAonline. I really appreciate Alina and Bobby for your efforts in doing really an excellent memorandum on that.

And among the activities, discussions, that our subcommittee is having is whether there should be some follow-up by the Advisory Committee to see how agencies have implemented the goals of OMB, and what Alina and Bobby set out in the memorandum in terms of preservation of FOIA responses in a transitional period to the new platforms, and just in general compliance. So we will be having that conversation.

We also are engaged in talking about how agencies might early on in the process have a dialogue with requesters. Especially about issues that really tie to what Eric Stein and others earlier in this meeting were talking about. The volume of records is tremendous. We see the wave coming, especially in light of the 2024 mandate from OMB and NARA for the entire government to transition to electronic record keeping and ultimately to accessioning permanent records at NARA. So there's a tremendous FOIA issue, a looming FOIA issue ahead. The question is how in the early stages of a FOIA request agencies and interested requesters can engage in a dialogue.

And so we've had those discussions. We'll, I hope, come up with one or more recommendations on that subject and continue to talk about modernization in general. I think that's it.

Gorka Garcia-Malene: Thank you, Jason. I guess I'd like to know, do any of our fellow subcommittee members have any additional comments? Any questions from the rest of the Advisory Committee?

Alina M. Semo: I have a question for Adam just for clarity. I just want to make sure I understand where you are at with the model letter. You're working on digesting and incorporating the comments, and then you'll be circulating another draft. Are you planning to present it to the committee at our December meeting for a vote or is that premature?

Jason R. Baron: Well, let me answer on behalf of the subcommittee. Alina, it's Jason Baron. I believe that the process will continue. Whether it's in December or whether it's incorporated into a final report or a further report in the new year, I can't say. I don't think we can commit our subcommittee at this time.

We want to do an excellent job incorporating as many public comments as we can and explaining what we have done. And also very importantly, having a further dialogue with Bobby and others at the Department of Justice. Because ultimately this determination letter, in my view, I'm speaking only for myself here, really needs buy-in from you and from Bobby to make sure that it will be taken seriously and worked with and adopted by the federal community at large.

Alina M. Semo: Okay. Thank you for that clarification.

Gorka Garcia-Malene: But I also want to add that we are dedicating quite a bit of energy to getting this to you all in good form as soon as we can. And of course Adam is doing most of the work, but everybody is involved. We look forward to bringing this to you all as soon as we can.

Alina M. Semo: Great.

Gorka Garcia-Malene: Alina, thank you for the opportunity to report out on the subcommittee's progress. These are just a few of the projects that we're working on. I think we all remain very excited about the work that we're delivering on behalf of the requester community, so thank you. That is our update.

Alina M. Semo: Thank you. Okay. Any other questions before we move on to the last part of our agenda today? A few of you have been very quiet today, which is uncharacteristic, as in Tom Susman. But that's okay. Tom always has something to say. You're saving it up for the next meeting. Right, Tom?

Okay, so not seeing anyone else eager to comment. We have now reached the public comments part of our committee meeting. We look forward to hearing from any of our non-committee participants who have ideas or comments to share, particularly about the topics that we discussed today. All oral comments are captured in the transcript of the meeting, which we will post as soon as it is available. Oral comments are also captured in the NARA YouTube recording and are available on the NARA YouTube channel.

Just a reminder, public comments are limited to three minutes per person. Before we open up our telephone lines, I'd like to turn over things to Kirsten, our DFO. Kirsten, I'd like to check in with you first. Let us know if we have received any relevant questions or brief comments via Webex chat during the course of our meeting.

Kirsten Mitchell: Hi, Alina, this is Kirsten Mitchell, the Designated Federal Officer. We have a couple of questions, which I'll briefly read and hopefully try to answer. One question: why do OGIS and OIP (Office of Information Policy) disable the YouTube chat function, quote, "depriving citizens from participating"?

First of all, I cannot speak for the Department of Justice. Second of all, I'll say that we at the National Archives very much value citizen participation. Any member of the public is permitted to file a written statement with the committee in accordance with federal regulations governing all federal advisory committees. That's in accordance with the Federal Advisory Committee Act.

I'll also note, and we're in this period now, any member of the public may speak or otherwise address the committee as the agency guidelines permit. Obviously here at the National Archives, those guidelines do permit since one of our strategic goals is to make access happen.

There is another question about funding levels needed by OIP and OGIS to execute their missions and develop employees. Once again, I cannot speak for the Department of Justice. But I am pasting in the chat for everyone the National Archives FY24 budget justification. That should answer some questions. That is all I see.

Alina M. Semo: Okay.

Kirsten Mitchell: Back over to you, Alina.

Alina M. Semo: Great. Thank you so much, Kirsten. Bobby, I just want to give you the opportunity to answer or respond to any of those inquiries if you want to.

Bobby Talebian: Yeah, thank you. I appreciate that. I certainly do. And we very much value public participation and engagement in all of our public events. Similarly, [we] provide different opportunities for the public to engage, like public commenting periods that we do in the CFO Council meetings. And so that's very important to us as well.

As far as budget and funding, I can tell you, all organizations I think you can go to one that says they could use more resources, but the department is very invested in the mission of OIP. I don't have our budget handy, but I can tell you that we're well-supported by the department. As you know, the attorney general issued FOIA guidelines that support the mission of FOIA government-wide. That's further showing the support that I get for the delivery [of] our mission.

Alina M. Semo: Okay. Thank you so much, Bobby. Really appreciate that. Michelle, may I please turn to you now and ask you to just provide instructions to any of our listeners for how to make a comment via telephone?

Michelle [producer]: Absolutely. So ladies and gentlemen, as we enter the public comments session, please limit your comments to three minutes. Once your three minutes expires, we will mute your line and move on to the next commenter. Once again, each individual will be limited to three minutes each.

Alina M. Semo: Michelle, do we have any callers in queue?

Michelle [producer]: Let me take a quick look. So far, let's see, I do not see...I'm looking. I don't see anybody in queue. As a reminder ladies and gentlemen, if you are logged into today's session via Webex audio, please click the raise hand icon, which is located in the lower toolbar. This will enter you into the comment queue. If you are dialed in today via phone only audio, please click pound two. That will raise your hand as well.

Alina M. Semo: Okay. While we're waiting for anyone else out there to be queued up, Kirsten is indicating to me that she has one other item she wants to bring up. So Kirsten, back over to you.

Kirsten Mitchell: Sure. This is Kirsten Mitchell, the Designated Federal Officer. There was another comment regarding minutes of these meetings. When I say these meetings, the FOIA Advisory Committee and the Chief FOIA Officers Council, those are two separate bodies. I just want to put on the record that FOIA Advisory Committee minutes are governed by the Federal Advisory Committee Act. Chief FOIA Officers Council minutes are governed by the Freedom of Information Act.

FOIA requires that CFO Council minutes contain a record of the persons present. The Federal Advisory Committee Act does not have that requirement. Thank you. Back over to you, Alina.

Alina M. Semo: Thank you, Kirsten. Appreciate that clarification. Michelle, anyone waiting to speak on our telephone lines?

Michelle [producer]: We do have a caller in the queue. Caller, go ahead. Your line is unmuted. You have three minutes.

Bob Hammond: Yes, this is Bob Hammond. I have submitted many public comments with thoughtful recommendations to OGIS, O-G-I-S, and DOJ OIP, but they refuse to post them. Instead, now unnecessarily requiring a character limited, text-only document that limits content and diminishes the impact of my extraordinary accessibility screen PDF (Portable Document Format) presentations and those of others.

The number of written public comments is minuscule compared to the thousands of PDFs that NARA and DOJ posts. It's not about ADA (Americans with Disabilities Act) accessibility. NARA and DOJ OIP disfavor the content and the powerful presentations. Then NARA and DOJ OIP now disable the chat function in YouTube, depriving citizens of the opportunity to contemporaneously participate in open meeting discussions, which are then later viewed by thousands. This is wrong. FOIA Advisory Committee, please change your bylaws to incorporate this and consider my other recommended changes.

Additionally, Ms. Semo's statement that comments in the chat window will not be recorded in the transcripts appears to be a violation of the FACA and other laws. And if OGIS destroyed them, as they claim in response to my FOIA request, that may violate multiple laws. NARA's Unauthorized Records Disposition Unit and OIGs are reviewing this.

I've been advocating for increased funding for OGIS and DOJ OIP for years, but OGIS and DOJ OIP do not advocate for themselves and NARA and DOJ refuse to seek adequate funding. The FOIA Advisory Committee has considered recommending moving OGIS under GAO (Government Accountability Office) with direct funding from Congress. My new idea is to transfer the currently poorly executed OGIS and DOJ OIP FOIA compliance and audit functions to GAO, which is a great fit for GAO, while OGIS and DOJ OIP retain their current funding. This would immediately double Ms. Semo's mediation staff with funding for increased training, professional certifications, increased grades, and professional opportunities.

OGIS mediation responsibilities conflict with court compliance, mediation, and NARA cases, while DOJ OIP has severe conflicts of interest in acting as the appellate authority for DOJ and defending agencies in court.

Next, every agency budget should have a top-line budget item for records management and FOIA, justified by how dismal performance and employee professional development retention are without the funding. This would be a game changer for beleaguered, overwhelmed FOIA staffs and a gift to our nation.

Ms. Wall, Ms. Shogan, NARA is not transparent in FOIA. They violate laws, regulations, and policies. See my written public comment presentations. Thank you.

Michelle [producer]: Thank you for your comments, sir. All right. I do not see any additional commenters in the queue at this time.

Alina M. Semo: Okay. Thank you very much, Michelle. So I think we're able to give back committee members the gift of time, which I'm thrilled to do. I want to thank all the committee members for the continued hard work that everyone is engaged in. I want to also thank, again, our State Department colleagues for their presentation today. Look forward to seeing everyone virtually in this space at our next meeting Thursday, December 7th. Again, we're in sevens. We're going to begin at 10:00 AM.

I want to thank all of you for joining us today. Hope everyone and their families remain safe, healthy, and resilient. I want to ask our committee members if there are any other questions or comments before we adjourn.

I don't see any hands up, so I am happy to be able to give you six minutes back plus 30 minutes so that's 36 minutes. Without further ado, we stand adjourned. Thank you. Take care, everyone.

Michelle [producer]: That concludes our conference. Thank you for using event services. You may now disconnect.
