This week’s Pipeliners Podcast episode features Jackie Smith, president of PODS, discussing the Pipeline Open Data Standard with host Russel Treat.
In this episode, you will learn about the PODS Data Model, the role of the PODS association supporting data integrity, how to combine pipeline integrity data with location data and mapping to perform risk analysis, and how to simplify and streamline risk analysis using optimized data.
You will also learn more about the background of Jackie Smith and what led her to become involved with PODS. Also, listen to this week’s episode to learn more about how you can become involved with the association!
PODS: Show Notes, Links, and Insider Terms
- Jackie Smith is the president of PODS and a GIS IT Architect for Williams. Connect with Jackie on LinkedIn.
- PODS (Pipeline Open Data Standard) supports the growing and changing needs of the pipeline industry through ongoing development, maintenance and advancement of the Data Model and Standards. PODS also serves as a member association to maintain the PODS Data Model.
- The Pipeline Open Data Standard is a database schema (architecture) for pipelines. It defines how information relevant to the life cycle of a pipeline is stored in a database.
- PODS 7 is the latest version of the Pipeline Open Data Standard, released in May 2019. [Members-only access]
- The PODS Association is a not-for-profit industry standards association that develops and maintains the PODS Data Model — the pipeline data storage and interchange standard for the oil & gas industry.
- The PODS Conference is a networking conference for members and non-members. This year’s event is September 30 and October 1 in Houston, Texas. [Register Today]
- GIS (Geographic Information System) is a system designed to capture, store, manipulate, analyze, manage, and present spatial or geographic data.
- Esri (Environmental Systems Research Institute) is the largest GIS software supplier in the world. Esri holds an annual International User Conference in California.
- Data Warehousing, also known as an enterprise data warehouse, is a system used for reporting and data analysis. It is considered a core component of business intelligence.
- Williams is an energy infrastructure company based in Tulsa, Oklahoma. Its core business is natural gas processing and transportation, with additional petroleum and electricity generation assets spread across the globe.
- Oracle is a relational database management system that provides an open, comprehensive, and integrated approach to information management.
- SQL Server is a relational database management system developed by Microsoft.
- PHMSA (Pipeline and Hazardous Materials Safety Administration) is a United States Department of Transportation agency created in 2004. PHMSA is responsible for developing and enforcing regulations for the safe, reliable, and environmentally sound operation of the 2.6-million-mile US pipeline transportation system.
- API (American Petroleum Institute) is the only national trade association that represents all aspects of America’s oil and natural gas industry.
- API 1178 (Integrity Data Management and Integration) provides guidelines for how to integrate the underlying data used to support integrity management.
- Integrity Management (Pipeline Integrity Management) is a systematic approach to operate and manage pipelines in a safe manner that complies with PHMSA regulations.
- NACE (National Association of Corrosion Engineers) is an international organization dedicated to protecting people, assets, and the environment from the adverse effects of corrosion.
- The External Corrosion Direct Assessment (ECDA) is a joint program between PODS and NACE that enables electronic integration of data and standard reporting within the pipeline industry to allow transfer between different software packages or computer systems.
- ISA (International Society of Automation, formerly known as The Instrumentation, Systems, and Automation Society), is a non-profit technical society for engineers, technicians, businesspeople, educators and students, who work, study or are interested in automation and pursuits related to it, such as instrumentation.
PODS: Full Episode Transcript
Russel Treat: Welcome to the Pipeliners Podcast, episode 89, sponsored by Gas Certification Institute, providing training and standard operating procedures for custody transfer measurement professionals. Find out more about GCI at gascertification.com.
Announcer: The Pipeliners Podcast, where professionals, Bubba geeks, and industry insiders share their knowledge and experience about technology, projects, and pipeline operations. Now your host, Russel Treat.
Russel: Thanks for listening to the Pipeliners Podcast. I appreciate you taking the time, and to show that appreciation, we are giving away a customized YETI tumbler to one listener each episode. This week our winner is Dan Geers with Duke Energy. To learn how you can win this signature prize pack, stick around until the end of the episode.
Well, this week, we’re very fortunate to have with us Jackie Smith. Jackie is the president of PODS, which is the volunteer organization that supports, maintains, and promotes the Pipeline Open Data Standard. She’s here to introduce to us what is PODS. Jackie, welcome to the Pipeliners Podcast.
Jackie Smith: Thank you, happy to be here.
Russel: I’m glad to have you. Actually, this is one of those things that I’ve known about for a very long time, and I’ve always wanted to learn more, so I’ve got an opportunity to get educated. Maybe a good way to start here is, why don’t you tell the listeners a little bit about your background, how you got into pipelining, and how you got involved with PODS.
Jackie: Sure. I started in the industry in the late ’90s with a large operator, pipeline operator, transmission, and I worked here for several years. I went to local government, did a stint there, and then I decided I wanted to come back to pipelines because I loved it so much.
I came back right before the drop of oil price in 2014, but I was able to stay on and been here ever since. I am involved in everything related to the database around GIS, geographic information systems. I’ve been doing that since, like I said, the late ’90s.
Russel: What did you do in public service?
Jackie: I went into public service, and I did GIS there, too. I worked for the local appraisal district, Harris County Appraisal District, which was an assessor’s office here in Texas. I did their data model and their parcels and helped get that in sync and upgraded.
I went on to the City of Houston. I took on their GIS. I was an [executive assistant director] there for several years.
Got into data warehousing and business intelligence. Then I got involved here with Williams.
Russel: How did you get involved with PODS?
Jackie: PODS, I was actually involved early on. There was an Esri Conference in ’98 where I was a part of…I was a newbie at Williams. I worked for Williams.
The data model was actually starting to come together. They needed something to store the pipeline data. They needed a way to reference assets and events, like a crossing, for example. A river crossing on the pipeline.
I got involved there when one of the gentlemen asked my opinion on the data model. He was working for a service provider at the time that was working with our company. I started getting involved in that and implementing the actual data model, which I can describe here in a sec, at Williams and helping enhance it and grow it.
Russel: Note to listeners — when somebody in a volunteer organization asks for your opinion, be very careful because you could end up getting volunteered.
Jackie: Yes. You’re young. You’re ready to work. You’re not going to say no.
Russel: I’m the same way. I do that all the time. I think that’s awesome. What is PODS?
Jackie: PODS, the name actually came together. It actually stands for Pipeline Open Data Standard. The pipeline is obvious. The open — it doesn’t mean free, but it is open to members. It’s non-proprietary. It’s open in that it’s open collaboration. It’s developed by a group of industry partners.
There is a free model called PODS Lite that is available for free that you can download on the pods.org website.
The big words here are data standard. Data, it’s a data model. A model defines and stores data standards.
You think of a data model as a data architecture diagram. It has representation of the data. It’s a standard format. It includes rules for the data, such as a valve can only have so many diameters. You don’t want to put the wrong diameter in the pipe.
It has relationships for the data. It will store this valve belongs to this pipe, for example. It’s a way to communicate to your IT. Your IT actually develops it as a way to communicate to your business and to store that data in any relational database management system you want, whether it be file-based or an Oracle or SQL Server.
Russel: We are using a lot of words that some of the listeners might not be familiar with. Let’s unpack a little bit relational database. What is a relational database?
Jackie: A relational database is pretty much what we all use since the mainframe. The mainframe was a hierarchical database and it really wasn’t a database. What we use now, which we’ve been using since, probably, the ’90s, is what’s called a relational database.
That stores your data components and tables, columns, and rows. Those columns and rows represent data. Those are stored in a format that has what you call relationships to each other.
Like I said, a valve has a relationship to a pipeline. You’re able to associate it together in the database and bring it into a report, for example. It’s a more efficient way to store data and query data. You can write applications on it. I can go into some of that.
Like I said, I take that for granted. It’s the base for all your data in all of our systems. They’re all stored in some type of relational database.
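As a rough illustration of the tables, rows, and relationships described here, the following minimal sketch uses Python's built-in sqlite3 module. The table names, column names, and sample values are invented for illustration and are not taken from the PODS Data Model.

```python
import sqlite3

# In-memory database; all table and column names are hypothetical, not PODS tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pipeline (
        pipeline_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE valve (
        valve_id    INTEGER PRIMARY KEY,
        pipeline_id INTEGER NOT NULL REFERENCES pipeline(pipeline_id),
        diameter_in REAL NOT NULL
    );
    INSERT INTO pipeline VALUES (1, 'Mainline A'), (2, 'Lateral B');
    INSERT INTO valve VALUES (10, 1, 24.0), (11, 1, 24.0), (12, 2, 12.0);
""")

def valve_report(connection):
    """Join each valve to the pipeline it belongs to, as a simple report."""
    query = """
        SELECT p.name, v.valve_id, v.diameter_in
        FROM valve AS v
        JOIN pipeline AS p ON p.pipeline_id = v.pipeline_id
        ORDER BY v.valve_id
    """
    return connection.execute(query).fetchall()

report = valve_report(conn)
for row in report:
    print(row)
```

Each valve row carries only the identifier of its pipeline; the join brings the pipeline name into the report when it is needed, so the name is stored once.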
Russel: I think it’s a hard time, sometimes, to understand some of this technical jargon and what is a relational database. For those of us that work with technology, we’re very clear of what that is.
I think the way I think about it is, kind of a simple model most people would understand is an Excel spreadsheet is a database. It’s got rows and columns, right. If I do confirm data entry for one of those columns where I’m picking that information up from someplace else, that’s relational.
That’s a rough idea, but it’s basically…It kind of gets you there.
Jackie: I would respectfully disagree, just because when I think of database, Excel is not a database. It’s a quasi database, but a database adds the relational integrity of your data where you can have roles. You can have a storage that’s backed up, whereas an Excel file, it could be backed up as a file but there’s no recall.
Technical stuff that goes on behind the scenes.
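One way to picture the relational integrity that separates a database from a spreadsheet is a foreign key check. The sketch below, again using Python's sqlite3 module with made-up table names, shows the database refusing a valve that points at a pipeline that does not exist. It is an illustration under those assumptions, not a description of how any PODS implementation is configured.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE pipeline (pipeline_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE valve (
        valve_id    INTEGER PRIMARY KEY,
        pipeline_id INTEGER NOT NULL REFERENCES pipeline(pipeline_id)
    )
""")
conn.execute("INSERT INTO pipeline VALUES (1, 'Mainline A')")

def try_insert_valve(connection, valve_id, pipeline_id):
    """Return True if the row is accepted, False if the database rejects it."""
    try:
        connection.execute("INSERT INTO valve VALUES (?, ?)", (valve_id, pipeline_id))
        return True
    except sqlite3.IntegrityError:
        return False

accepted = try_insert_valve(conn, 10, 1)   # pipeline 1 exists
rejected = try_insert_valve(conn, 11, 99)  # pipeline 99 does not exist
print(accepted, rejected)
```

A spreadsheet would happily accept the second row; the database rejects it because the relationship is part of the schema.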
Russel: All the things necessary to give the data high-quality, high-integrity, high-availability, and make it an enterprise solution.
Jackie: Yes. Many people can edit it. You have Security Database 101. In theory, yes.
Excel is a way to think about the format of your data. You have columns and rows. You may have a relationship to another piece of data that’s tied with what’s called a foreign key/primary key relationship. That’s the gist of relating your data.
You don’t have all your data just in one huge table, which is like a mainframe was. You had all the data in one big clump. Then, you had to program your…
Russel: I’m old enough to remember that, Jackie.
Jackie: I’m sure. We still have one at the appraisal district. They may still even have it.
Russel: Oh my God. That’s just painful.
I think we went down a rabbit hole a little bit, but that’s okay. Who is the membership? How is PODS organized?
Jackie: PODS is a voluntary organization. We do have paid contractors. Our executive director and we have some technical staff that are contracted with. For the most part, it is a volunteer organization.
Members pay to be in the non-profit industry organization. We have roughly over 140 members. That varies from year to year. That’s made up of about half of operators, so pipeline operators. You have large mileage operators and we also offer a discounted membership for small operators that are under a thousand miles. Then we have organizations, of course, such as PHMSA.
We have many service providers or vendors that are also part of the organization and that pay a member fee. It’s minor compared to the value of what you would get with the data model and using the data model.
Russel: I think this is one thing most people would understand about a data model, and particularly an open data model, meaning anybody can use it. Any of us can sit down and design a relational database, but if any two groups of people do that and then they try to share the data between one another, it can get really, really hard, really ugly, really quick.
Nobody is going to…When you get through all the details of how to organize data, nobody’s going to do it the same way. There will be enough distinctions that it will cause you lots of grief.
Jackie: Yes, you’ve just hit on the main problem in many organizations and even in sharing data with the government. We don’t all store it the same way. We don’t have a way to share it or program a way to share it, which is what we call a data exchange specification, and that is also part of our data model.
The data model is the critical piece. You have to have that. You have to all store the data in a similar, if not exactly the same, standard that’s agreed upon by the industry. Of course, that’s another reason why it’s open, meaning that we have open collaboration. There is a way that you can say, “I don’t like this. I’m going to change this.”
There’s a process that you can go through to be involved. Then, of course, the second part was this data exchange specification, which is an interchange, a way to shake hands between databases that we are developing. That’s new for PODS 7.
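The problem of two groups storing the same data differently, and the idea of an agreed exchange format, can be sketched in a few lines of Python. Everything here is hypothetical: the two operators, their column names, and the shared field names are invented, and the output is not the PODS data exchange specification.

```python
import json

# Two hypothetical operators store the same kind of valve record differently.
operator_a_row = {"VALVE_NO": "V-100", "DIA_IN": 24.0, "LINE": "Mainline A"}
operator_b_row = {"valveId": "V-200", "diameter_mm": 304.8, "pipelineName": "Lateral B"}

MM_PER_INCH = 25.4

def from_operator_a(row):
    """Map operator A's column names onto the shared (made-up) exchange fields."""
    return {"valve_id": row["VALVE_NO"], "diameter_in": row["DIA_IN"], "pipeline": row["LINE"]}

def from_operator_b(row):
    """Map operator B's column names and convert millimetres to inches."""
    return {
        "valve_id": row["valveId"],
        "diameter_in": round(row["diameter_mm"] / MM_PER_INCH, 3),
        "pipeline": row["pipelineName"],
    }

exchange = [from_operator_a(operator_a_row), from_operator_b(operator_b_row)]
print(json.dumps(exchange, indent=2))
```

Without an agreed target, every pair of organizations needs its own translation; with one, each side writes a single mapping to the shared format.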
Russel: That’s actually a great tee-up because that’s where I was going to go next. Having been around since the late ’90s and given how much technology has evolved and how much the data that we’re able to capture has evolved, I’m curious. How much has this data model evolved over that period of time?
Jackie: It’s actually not that much, actually. We’ve been around for 20 years. The initial evolution of the pipeline data model was very relational only. It started out as a relational data model. It’s been a relational data model of pipeline events on a linear reference system. I can go into more detail about that.
At PODS, we recognize that it needed to grow and needed to become more flexible and more usable by various industry groups. It needed to be available in a spatial component as well as relational but it’s all generated from one model.
Now we have a geospatial component. We still have the relational aspect of the model that is available. That may be Greek to some people, but it’s a lot of flexibility and robustness in this new data model for PODS. It includes many things about your pipe. We want to continue to grow that.
Russel: I’ll try and break it down this way. You might correct me because I’m probably going to way oversimplify this. I think about it this way. If you think about a linear model for how the data is organized, it’s kind of like by milepost.
As I go down the pipeline by a milepost, I find different features about the pipeline. Here’s a well. Here’s a valve. Here’s an elbow, that kind of thing.
That’s a linear model. A relational model’s the next generation of that or the next level of complexity, if you will, where the data is not just linear.
It’s like, “Okay, here’s a valve. Oh, and then here’s all the attributes about the valve. That’s a relational characteristic. Linear, there’s a valve here. Relational, here’s all the data about the valve.”
Geospatial is, yes, but all this information’s in three dimensions. It’s north, south, east, west, up, and down. That captures all of that.
Jackie: Right. The linear part that you mentioned, and the relational part that you mentioned, is all relational. The linear segmentation, which you mentioned when you were referencing…I know a valve is on this part of the pipe. If I change this pipe out, I know that it’s on this new pipe because of that relationship that it had before.
The geospatial context comes in. Now we’re saying it’s georeferenced. It has a location on this planet. It has an XY. This pipeline is in this particular part of the country. It’s on this projection. That’s the geospatial component.
This valve actually has a coordinate now. It actually has a location. That’s what the geospatial, the GIS context, gives it.
Then you can add a third dimension of height, or 3D, but right now we’re mostly focused on the XY and that relationship of where those assets are in that line.
That relationship of class location. For example, where I go through high-consequence areas, I need to know exactly where those safety issues I need to pay attention to. That’s a linear event, if you will. Now, it’s another part of the pipe. It’s a line, if you will. A line. GIS is points, lines, and polygons. That would be a line.
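The ideas in this answer, a point located by its measure along the line, a line event such as a class location spanning a range of measures, and an XY coordinate for each, can be sketched as follows. The route geometry, the event names, and the measures are invented for illustration, and the coordinates are assumed to be in a projected system measured in feet.

```python
from dataclasses import dataclass
from math import hypot

@dataclass
class LineEvent:
    """A condition that applies between two measures along the route."""
    kind: str
    begin_measure: float
    end_measure: float

# Hypothetical route centerline as (x, y) vertices, units assumed to be feet.
route = [(0.0, 0.0), (300.0, 400.0), (300.0, 1000.0)]

line_events = [
    LineEvent("class_location_1", 0.0, 600.0),
    LineEvent("class_location_3", 600.0, 1100.0),
    LineEvent("high_consequence_area", 450.0, 900.0),
]

def locate(route_vertices, measure):
    """Interpolate the (x, y) position at distance `measure` along the route."""
    if measure < 0:
        raise ValueError("measure must not be negative")
    travelled = 0.0
    for (x0, y0), (x1, y1) in zip(route_vertices, route_vertices[1:]):
        length = hypot(x1 - x0, y1 - y0)
        if length == 0:
            continue  # skip duplicate vertices
        if measure <= travelled + length:
            t = (measure - travelled) / length
            return (x0 + t * (x1 - x0), y0 + t * (y1 - y0))
        travelled += length
    raise ValueError("measure is beyond the end of the route")

def events_at(events, measure):
    """Return the kinds of line events that cover a given measure."""
    return [e.kind for e in events if e.begin_measure <= measure <= e.end_measure]

valve_measure = 800.0  # a point event: a valve 800 ft along the route
valve_xy = locate(route, valve_measure)
valve_context = events_at(line_events, valve_measure)
print(valve_xy, valve_context)
```

The valve is stored once, by measure; its coordinate and the class location or high-consequence area it falls in are both derived from that one reference.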
Russel: Points, lines, and polygons. It sounds like trigonometry.
Jackie: Well, the basis of geometry.
Russel: I’m kind of curious. The things that you’re talking about are mostly about the assets. PODS, would it be safe to say or fair to say that PODS grew up as an asset-centric data model?
Jackie: I think that we’ve tried to call it more of an asset-centric data model because that’s what it is. It actually started as a way to define the reference of events along the pipelines for pipeline safety and pipeline integrity.
The events that happened in our country, the incidents that happened. You need to know where your pipeline is in regard to the Earth. You need to know where all these high-pressure pipes are. That’s how it started.
I don’t want to say it’s just only assets because it’s the pipeline industry. Of course we care about our infrastructure, but we care about the product and the commodity of what we’re sending and how that affects the markets and such.
It started as an asset, but a location based asset primarily.
Russel: It’s a combination of here’s what assets I have. Here’s where they are on the map.
Jackie: Right, and this is the pressure. These are all the manufacturer details. The specs, the pressures, the mill report, the record. We need to make sure these are correct, those are traceable, verifiable, and correct.
That’s a term that we use in the industry to make sure that that data is traceable. If something did happen, we know exactly what caused it or how to prevent things.
Russel: Are there any API standards that govern all of this?
Jackie: No, there are none that govern it but there are some. There’s one called an integrity management system that does call to using a PODS data model, I believe it’s 1178. It does reference using PODS in that area.
We do have a data interchange specification with NACE on external corrosion data. External Corrosion Direct Assessment — that is also on our website.
Russel: I think there’s probably a huge opportunity in that domain. I know that the nature of what you can do with multispectral and 3D imaging around ground movement, and leak detection, and other things. Certainly, I think there’s lots of opportunity for PODS to simplify and address lots of other data types that are beginning to emerge and get immense, if you will.
Jackie: Yes, that’s a good point. The industry’s growing. There’s lots of places where we need data standards. We’re a volunteer organization. We have the basics and we have what we think is important. For example, our next model to build out in our PODS 7 is a regulatory model, which covers some of the data that you would need to be PHMSA compliant.
Russel: Oh, that’s fascinating. I want to know more about that. Maybe we’ll have another conversation and get even more detailed.
What would you say is the benefit of the standard?
Jackie: The benefits of the standard are pretty numerous. The cost, this cost savings you’re going to get from having something that you can reuse, is phenomenal. If you have a standard, you can write your applications against it.
Think of a standard like if you’re an engineer. Of course you’re going to use a standard to do something. That’s how I sell it, if you will, to people. It’s a standard for your data. You’re not going to want to use something that’s not a standard.
It’s going to increase the integrity of your data because you can put certain rules on your data to help reduce errors, such as a category or a list, if you will, for when you enter a valve, like I said earlier, diameters or manufacturers, instead of just letting somebody key in anything. They’ll have to key in from a specific code list that’s built for them.
That’s a basic one, but that lets you know that the standard is so important. Why reinvent the wheel every time you create something? You’re going to want to reuse it.
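The code list idea can be sketched as a simple validation step. The lists below are hypothetical examples, not values defined by PODS; the point is only that entries are checked against a fixed list instead of being typed freely.

```python
# Hypothetical code lists; a real implementation would define its own values.
CODE_LISTS = {
    "nominal_diameter_in": {4, 6, 8, 12, 16, 20, 24, 30, 36},
    "valve_type": {"BALL", "GATE", "CHECK", "PLUG"},
}

def validate_valve(record):
    """Return a list of problems; an empty list means every coded field is valid."""
    problems = []
    for field, allowed in CODE_LISTS.items():
        value = record.get(field)
        if value not in allowed:
            problems.append(f"{field}={value!r} is not in the code list")
    return problems

good = {"nominal_diameter_in": 24, "valve_type": "BALL"}
bad = {"nominal_diameter_in": 25, "valve_type": "ball valve"}

print(validate_valve(good))
print(validate_valve(bad))
```

In a relational database the same rule is usually expressed as a lookup table referenced by a foreign key, so the check happens at the moment of data entry.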
Russel: This is a huge conversation for me in the automation world because there’s no standard in automation for how you name all the points. ISA has some things, but there’s not anything that’s well understood and fairly well adopted in the pipeline space.
It means that migration and connection and management of that is much more complicated, particularly as things age and mature. It’s so much easier if you’ve got a standard and you’re building everything with a standard.
Jackie: We do not have standard naming or standard code lists. We have thought about doing that in the past. We do provide a template for you to start with, but those are always extendable and customizable.
The standard comes in that it’s a standard data type or it’s a standard field length. An integer, not a decimal. Those type of things.
If you’re all using a similar standard like that, it’s so much easier for people to program against it and integrate with your system. If you want to sell your assets or your pipeline or you want to buy another company that has their data in that PODS standard, it’s that much quicker, easier, cheaper to integrate.
Just all the way around, good things.
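The distinction drawn here, a standard for data types and field lengths as opposed to standard names or code values, can be sketched as a conformance check. The field definitions and the 20-character limit below are assumptions made up for the example, not PODS definitions.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class FieldSpec:
    """Expected Python type and, for text fields, a maximum length."""
    py_type: type
    max_length: Optional[int] = None

# Hypothetical field definitions for illustration only.
VALVE_SPEC = {
    "valve_id": FieldSpec(int),
    "tag": FieldSpec(str, max_length=20),
    "diameter_in": FieldSpec(float),
}

def conformance_errors(record, spec):
    """List every field that is missing, of the wrong type, or too long."""
    errors = []
    for name, field in spec.items():
        if name not in record:
            errors.append(f"{name}: missing")
            continue
        value = record[name]
        # bool is a subclass of int in Python, so it is rejected explicitly.
        if isinstance(value, bool) or not isinstance(value, field.py_type):
            errors.append(f"{name}: expected {field.py_type.__name__}, got {type(value).__name__}")
        elif field.max_length is not None and len(value) > field.max_length:
            errors.append(f"{name}: longer than {field.max_length} characters")
    return errors

conforming = {"valve_id": 10, "tag": "MLV-1", "diameter_in": 24.0}
nonconforming = {"valve_id": 10.5, "tag": "MLV-" + "0" * 30, "diameter_in": 24.0}

print(conformance_errors(conforming, VALVE_SPEC))
print(conformance_errors(nonconforming, VALVE_SPEC))
```

When two systems agree on this much, an integer where an integer is expected and text that fits the field, software written against one can be pointed at the other with far less rework.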
Russel: No doubt. So very true, so very true.
I could get on my soapbox and preach about that right there. It’s a big deal. To the extent you…There’s structure and then there’s content. To the extent that you can get both of those put together in a way that they, at their core, do the job but can be tweaked to deal with company specifics. It’s so much more effective.
If I were wanting to know more about PODS, how would I learn about PODS?
Jackie: You can always go onto our website, pods.org. Not to be confused with pods.com. Totally different organization.
We have gotten emails from people about pods, the storage containers. We are not them. I’m sorry, that’s just a side story that we have about that.
You would go to that website, pods.org. We have webinars posted on the right hand side. We have our latest news right down the center.
We have a PODS Conference coming up in the fall, September 30 and October 1 that is free for all PODS members. It’s very inexpensive if you’re not a member.
Our first day, we will have some training. We’ve never really offered training as a PODS organization. Our service providers have always offered it, of course, and our software vendors, but as a PODS organization, this is pretty much the first time we’ve offered it.
We have a technical track and a non technical track. Look for details on that. The non technical is just real high level.
What is PODS? What is this logical model? What is this physical model? What are these concepts? What is linear referencing? What is the GIS piece? Just some basics there.
The technical track would be more in depth, opening the package of the PODS 7 download, and getting your hands a little more dirty, and implementing it, and finding out more about that.
Russel: The conference, what are the dates? Where is it going to be?
Jackie: That’s September 30 and October 1. It’s downtown [at the Embassy Suites]. More details are at PODS.org. It’s a half day on September 30, starting around noon. Then, all day October 1. After that is the GIS GIDA Conference, which is also in Downtown Houston. If you’re coming to Houston, you can have a whole week where you’re really embedded in the GIS world.
Russel: People are going to think I’m weird, but that sounds like fun to me.
Jackie: Yeah, no doubt.
Russel: Just a little bit about my history. I did an assessment of relational database technologies in 1988 / ’89 time frame, trying to figure out which of all the 15 or so competing technologies was going to end up grabbing market share. That was at the time that databases were moving off of mainframes and into minicomputers and such.
The whole relational database was a new idea. I know GIS is not new, but it seems to me like more and more and more of the applications are moving to this GIS approach versus the more classic relational approach.
Jackie: I think so. You can do a lot more when you have your data in a GIS format. There’s a lot of plug and play things.
The software of choice, Esri products, are obviously that’s GIS software. It works with all the GIS data, right out of the box. There’s so much you can do when you have your data in that format.
A lot of the reporting and the work that we do in safety and integrity is with GIS, ingrained in GIS software and capability. You can build the integrity and do a lot of the reporting with the relational model and linear referencing. That is being currently done in a lot of the larger companies.
The geographic way is going to streamline, I believe. Of course, we have the newer technology that is out there that already kind of exists in the database object world and the big data arena where the data is in a file, going back to file-based type systems but on a larger scale.
Russel: I think the ability to combine pipeline integrity data with location data and mapping data and the environmental data allows you to do risk analysis in a way that you — it simplifies and streamlines some of the risk analysis that you might want to do.
We’re all talking about data analytics these days. One of the prerequisites to do a good job of data analytics is to have good data.
Jackie: Exactly. You have to have that strong foundation before you can do any of that fancy stuff that you hear. Predictive analytics and data analytics. The new one, of course, is AI, artificial intelligence. Everyone’s wanting to promote that.
You have to have good, sound data that you can depend on and that’s correct or you’re going to get garbage in, garbage out type situation.
The data standard helps enforce that, that quality data.
Russel: Jackie, I certainly appreciate you coming on board and sharing this information about PODS. I know that we skipped a stone across the lake. We didn’t really talk about much of anything in great detail.
Certainly, just from my own standpoint, there’s a number of conversations about learning more about the details of the data model and how all that works that I’d be interested in doing. I’d love to have you back to dig into stuff that’s even more geeky than what we’ve been talking about today.
Jackie: Yes, yes. I rely on many volunteers, the PODS organization. Of course, I don’t know if I mentioned it. I am the president of the PODS organization. It’s a volunteer position. We have a fantastic board, fantastic volunteers on our technical committees. They’ve been doing this a lot longer than I have.
A lot of the content that’s out there is all because of the hard work that they’ve done in addition to their day jobs. A lot of the things that I’m able to articulate are because of the relationships I’ve built with these people.
A great organization.
Russel: That’s one of the great things about our business, too, is there are a lot of people like that that are volunteering their time. They’re trying to improve the industry’s ability to operate and do those things safely and more efficiently.
I get a great deal of joy out of having these conversations with people like yourself that are taking your time and donating it, if you will, to make things better. I think it’s awesome.
Thanks for coming on.
Jackie: You’re welcome.
Russel: We’re going to have to talk to you and get you on about some more specifics around PODS and how I might use it and apply it.
Jackie: Okay, sounds great. Thanks.
Russel: I hope you enjoyed this week’s episode of the Pipeliners Podcast and our conversation with Jackie Smith.
Just a reminder before you go, you should register to win our customized Pipeliners Podcast YETI tumbler. Simply visit pipelinepodcastnetwork.com/win to enter yourself in the drawing.
If you would like to support this podcast, you can do that by leaving us a review or a comment. You can do that on iTunes/Apple Podcasts, Google Play, or whatever smart device podcast app you use to listen to the Pipeliners Podcast. You can find instructions at pipelinepodcastnetwork.com.
Thanks for listening. I’ll talk to you next week.
Transcription by CastingWords