This week’s Pipeliners Podcast episode features Sheila Howard of P.I. Confluence discussing the key aspects of data, dashboards, and pipeline risk management to support an effective safety management program in pipeline operations.
In this episode, you will learn about the role of organizational failure in high-profile pipeline incidents. You will also learn how to use the PDCA cycle to address issues such as human error, how to support proper data collection, and how to identify and address other risk factors. Sheila and Russel also share tips for pipeliners regarding safety and risk management and discuss the value of having the right tools for the job.
Pipeline Risk Management: Show Notes, Links, and Insider Terms
- Sheila Howard is the VP & Software Solutions Manager for P.I. Confluence. Connect with Sheila on LinkedIn.
- P.I. Confluence (PIC) provides software and implementation expertise for pipeline program governance that is applied to operations, Pipeline SMS, and compliance. PIC leverages process management software to connect program to implementation.
- “Nightmare Pipeline Failures: Fantasy Planning, Black Swans and Integrity Management” examines the causes of the high-profile San Bruno and Marshall ruptures. The book argues that although the incidents were profoundly surprising to the pipeline operators involved, from a broader perspective they were not surprising at all because of the human, organizational, and regulatory failures that led up to them.
- API Recommended Practice 1173 established the framework for operators to implement Pipeline Safety Management Systems (SMS). A significant part of this recommended practice is a training and competency aspect.
- PipelineSMS.org is a useful resource with various safety tools that were developed by pipeline operators to help other operators enhance safety in their operation. Read the website resources or email pipelinesms@api.org with inquiries.
- The Plan Do Check Act Cycle (Deming Method) is embedded in Pipeline SMS as a continuous quality improvement model consisting of a logical sequence of four repetitive steps (Plan, Do, Check, Act) for ongoing improvement and learning.
- Maximum Allowable Operating Pressure (MAOP) is the maximum pressure at which a pipeline or segment of a pipeline may be operated under PHMSA regulations.
- MAOP (maximum allowable operating pressure) was included in a bulletin issued by PHMSA informing owners and operators of gas transmission pipelines that if the pipeline pressure exceeds MAOP plus the build-up allowed for operation of pressure-limiting or control devices, the owner or operator must report the exceedance to PHMSA on or before the fifth day following the date on which the exceedance occurs. If the pipeline is subject to the regulatory authority of one of PHMSA’s State Pipeline Safety Partners, the exceedance must also be reported to the applicable state agency. (A small sketch of the exceedance check and reporting deadline follows this list.)
- Integrity Management (Pipeline Integrity Management) is a systematic approach to operate and manage pipelines in a safe manner that complies with PHMSA regulations.
- DIMP (Distribution Integrity Management Program) activities are focused on obtaining and evaluating information related to the distribution system that is critical for a risk-based, proactive integrity management program that involves programmatically remediating risks.
- High Consequence Area (HCA) is a location that is specially defined in pipeline safety regulations as an area where pipeline releases could have greater consequences to health and safety or the environment.
- The referenced incident in West Virginia was the Columbia Gas Transmission Corporation Pipeline Rupture in Sissonville, W. Va. on December 11, 2012. [Read the full NTSB Incident Report.]
- Lagging Indicators are facts about previous events after an incident occurs (e.g. injury rates).
- Leading Indicators are pre-incident measurements that have a predictive quality (e.g. monitoring gas level).
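To make the MAOP reporting entry above concrete, here is a minimal sketch of the exceedance check and the “fifth day following” deadline. The function names and field names are illustrative assumptions, not part of any PHMSA system.

```python
from datetime import date, timedelta

def exceeds_reporting_threshold(pressure_psig: float, maop_psig: float,
                                allowed_buildup_psig: float) -> bool:
    """True if pressure exceeds MAOP plus the build-up allowed for
    operation of pressure-limiting or control devices."""
    return pressure_psig > maop_psig + allowed_buildup_psig

def report_due_by(exceedance_date: date) -> date:
    """The bulletin requires reporting on or before the fifth day
    following the date on which the exceedance occurs."""
    return exceedance_date + timedelta(days=5)

# Example: an exceedance recorded on March 3 must be reported by March 8.
if exceeds_reporting_threshold(1025.0, maop_psig=950.0, allowed_buildup_psig=50.0):
    print(report_due_by(date(2022, 3, 3)))  # 2022-03-08
```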
Pipeline Risk Management: Full Episode Transcript
Russel Treat: Welcome to the Pipeliners Podcast, episode 221, sponsored by EnerACT Energy Services, supporting pipeline operators to achieve natural compliance through plans, procedures, and tools implemented to automatically create and retain the required records as the work is performed. Find out more at EnerACTEnergyServices.com.
[background music]
Announcer: The Pipeliners Podcast, where professionals, Bubba geeks, and industry insiders share their knowledge and experience about technology, projects, and pipeline operations. Now, your host, Russel Treat.
Russel: Thanks for listening to the Pipeliners Podcast. I appreciate you taking the time. To show that appreciation, we give away a customized YETI tumbler to one listener every episode. This week, our winner is Tina Beach with CHS. Congratulations, Tina. Your YETI is on its way. To learn how you can win this signature prize, stick around till the end of the episode.
This week, Sheila Howard is returning to talk about data, dashboards, and risk management. Sheila, welcome back to the Pipeliners Podcast.
Sheila Howard: Hello, Russel. Nice to be back.
Russel: We were talking about that off microphone. I had to tell the listeners that Sheila doesn’t realize how good she is at doing this. She brings a lot of great information and great data, but it’s not something that she likes to do. I just wanted to recognize Sheila for agreeing to do this. Thanks for being here, Sheila. I very truly mean that.
Sheila: Thank you. You’re welcome.
Russel: Let’s dive in. Probably, we want to talk about data and dashboarding and risk management as a way to get into doing a better job of safety management.
Maybe a way to tee this up, you were mentioning a book that you have read, and I’m going to have to get a copy of it and read. Tell us a little bit about the book, “Nightmare Pipeline Failures,” and what some of your takeaways were from that book.
Sheila: We refer to it in-house here as “The Black Swan” book because it’s called “Fantasy Planning, Black Swans, and Integrity Management.” The book references overarching human factors or organizational failure as the root cause to the majority of the big incidents that have happened.
It highlights a couple of key failures that have happened in the past in our industry, and just talks through, like I said, the different organizational pieces. It talks a lot about the development of that API 1173 and how the Plan Do Check Act cycle is beneficial at curbing some of these human factor issues.
It talks about some general MAOP determination, grandfathering, integrity management, risk assessment, what those mean, and how individuals play a role in the safety culture.
One of the big things, one of the big lines I picked up on that I thought was interesting was “Many individuals are unaware of the role that they should play in preventing serious accidents. They fail to link their day-to-day actions with the potential for disaster.”
It’s true. Everybody just goes through the motions, checking their boxes, and sometimes doesn’t necessarily understand how they fit in the big picture.
Russel: That’s actually one of the reasons I started doing this podcast. I think pipelining by its nature is highly technical. We tend to exist in silos doing the part of the job that we do.
We don’t often get an opportunity to understand all the other things that are going on and get a sense of how our role fits into everything else that’s going on from an overall safety performance standpoint. That’s right on point.
When I hear this and I think about the big incidents, most of these are at their root, human failures. By that, I don’t mean that people are deliberately doing something wrong.
It’s just that there’s something about their level of training, their level of competency, the communication that’s occurring, something about the systems they’re using that’s contributing to the outcome. It’s a mismatch between the machine and the human.
Sheila: Correct. In-house, we call them the org bits. It’s your procedures, scheduling, resources, training, communications, the equipment you use. Even in the most recent big incident, the documentation and paperwork said that no individual employees are even going to be charged, because the case was the result not of one person’s actions but of a complete organizational failure.
Russel: That’s a really deep concept. I want to come back to this, but it might be helpful to talk a little bit about what kind of data are we currently collecting and how are we currently using that data to predict our safety performance? What kind of data are we currently collecting?
Sheila: The majority of the data that we collect now, I would put in the bucket of a lagging indicator. The majority is attributes or environmental attributes to the actual equipment itself.
Just for clarification. We’re not talking about hardhat safety. We’re talking about pipeline safety, big incidents, not slips, trips, and falls, which do play a role, but we’re talking about asset safety and what happens there.
Traditionally, it’s going to be your DIMP leak data. It’ll be things like excavation damage. It’s going to be all of what’s already happened from a risk perspective or, again, more of the environmental attributes. Even on the transmission side, they take into account the soil and those types of features and factors in the calculations, and nothing as it relates to human factors.
Russel: You could get to a prediction. I can look at my DIMP leak reports, and I can predict into the future what my risk is. I could do the same thing with excavation damage or environmental releases, or any of that kind of data. I could use that as a mechanism to predict future performance.
If the incidents are related to organizational failure and the data that we’re collecting is related to asset integrity information, there’s a disconnect there between what’s actually leading to the incidents versus the things we’re monitoring and trying to improve.
Sheila: Yeah. I don’t want to minimize that data that we’re collecting. It’s all very beneficial. I just think that there should be an organizational failure factor that you’d apply to the probability of the incident.
It’s the old adage. You put a new pipe in the ground versus an old pipe in the ground. You put the new pipe in. You just let it sit there. You don’t do anything about it.
You have an old pipe that’s been in there for 30 years, but you follow your maintenance. You follow your procedures. You do all your testing. You know everything about that pipeline. The question is which one’s more at risk?
Russel: Here would be my answer to that question. They should both be equally at risk. The distinction would be the people and their capability, their tools, and their procedures, like the details of that part.
Sheila: Correct. That would be something that you’d have to look at for the 30-year-old pipe. If you don’t know anything about the new pipe, to your point, when that new pipe was built, it’s not just the integrity piece. It’s the full life cycle: design, procurement, construction, commission, operation.
There are so many different factors that are involved in determining what the risk is. That’s the point, is that we can’t limit it to just what soil is in there or if there’s been a leak in the past.
Russel: Gosh, I’ve done so many podcasts now on the integrity management subjects around geotechnical, and cathodic protection, and corrosion, and cracking and dents, and all these things that occur to the physical asset, and all of those certainly go to your safety risk.
Often, when you actually look at the incidents that we’re having, they’re not related to those kinds of things. They’re more organizational failures, probably because we’re not looking at it as hard in our industry. Does that make sense to you? Would you agree with that?
Sheila: Yeah, I would agree with that.
Russel: What kind of data should we be collecting on the people side that might be a safety indicator?
Sheila: We need to start with those organizational pieces. If we send out communications and feedback requests, we can reach out and ask: the procedures that we have, are they current? Are they even accurate?
We have a lot of people with a lot of experience who have been doing certain tasks and following certain procedures, just in their heads, for many, many, many years. They just go through the motions and don’t necessarily need to look at the piece of paper.
If new people come in, if that piece of paper isn’t correct or current or accurate, that could bring you some risk from following a procedure perspective.
Russel: Certainly, in the stuff that I’ve done, I know that you can give somebody a procedure. If you don’t take them back to the fundamentals on some recurring basis, it’s very easy to drift off of the procedure over time and not really even be aware that you’re doing it.
Sheila: Exactly, yeah. Correct.
Russel: Do I have procedures? Are they accurate? Am I following them? Those are some things that I could…
Sheila: Do I have them?
Russel: Yeah, do I have them? Right. Do they exist? Can I find them? [laughs]
Sheila: Yeah, some basics.
Russel: Yeah. What are some other things beyond just the procedures?
Sheila: I rattled them off earlier, but even scheduling: are we prioritizing our work accurately? Do we have the right resources in place as far as competence and training? Do we do any kind of regular training, and do we check the effectiveness of the training?
Do we do real-time exercises to make sure that people are not just answering questions? If they’re in the field building something, do they actually know how to build whatever they’re working on?
Is there communication between groups? 1173 talks a lot about that whole communication piece. I told you an example. In West Virginia, they had an incident where they had two lines in parallel. One was an HCA. One wasn’t. They did all the HCA inspections, found corrosion, repaired everything.
Everything in their mind is good to go, but they didn’t inspect the other line because it was outside the bounds. Then that line blew up, burned a house down, and hit a highway. You need to be able to communicate these issues that you have with multiple groups and look at the bigger picture.
Russel: There are some other things that come up for me, too, in this domain. Do I have the right tools for the job? Are my tools in good order? Is there good supervision of the work that’s occurring?
Scheduling is an interesting thing because everybody in our world is always challenged by getting everything done in the time frame it needs to get done and making sure that anything that slips is the lower priority stuff. There’s this tension between the regulatory requirement and what we really need to be doing to ensure proper safety.
Sheila: That’s the whole compliance versus safety question. Just because you check the box doesn’t mean you’re safe. Adherence to compliance alone kind of weeds out or minimizes engineering judgment or professional expertise.
If somebody says, “Well, you’ve got to sign this piece of paper,” it doesn’t matter what the paper says. You need to have it signed from a compliance perspective. A lot of times, you find people just do what they’re supposed to do to check the box, not necessarily with good thought behind what they’re doing.
Russel: Right. Exactly. That makes perfect sense. For me, when you start talking about things like, “Do I have procedures? Am I following the procedures?” How do you actually collect that as data?
Sheila: I always say you need a tool for everything. If you have a tool that can actually reach out and aggregate all this information, and you have a big enough pool for sample size, you can collect this data and focus on structured responses in order to accurately determine the impact and/or what needs to be fixed.
If you get this structured data and you collect it all, you can actually put a number behind it, and then apply that to your probability of failure modeling.
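To illustrate what “putting a number behind” structured responses might look like, here is a minimal sketch. The question set, the scoring, and the way the score is turned into a factor are illustrative assumptions, not P.I. Confluence’s actual model.

```python
# Hypothetical structured survey responses for one task, answered 1 (yes) or 0 (no).
responses = [
    {"has_procedure": 1, "procedure_current": 1, "trained_last_year": 0, "can_locate_copy": 1},
    {"has_procedure": 1, "procedure_current": 1, "trained_last_year": 0, "can_locate_copy": 0},
    {"has_procedure": 1, "procedure_current": 1, "trained_last_year": 1, "can_locate_copy": 1},
]

def organizational_score(rows: list[dict]) -> float:
    """Average all yes/no answers into a single 0-1 score for the task."""
    answers = [value for row in rows for value in row.values()]
    return sum(answers) / len(answers)

score = organizational_score(responses)   # 0.75 for this sample
org_factor = 1.0 + (1.0 - score)          # > 1.0 inflates probability of failure when answers are poor
print(f"score={score:.2f}, organizational factor={org_factor:.2f}")
```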
Russel: Basically, I get one of these survey forms, and it’s asking me related to my job, “Do you have procedures to perform these tasks? When was the last time you were trained on those procedures? Can you locate a hard copy?”
Whatever those questions are, each organization might be a little bit different, but you just ask the people that are doing the work those questions.
Sheila: You try to get it specific to the tasks that they do. Again, the more structure, the better, from an analysis and dashboarding perspective.
If you leave it lofty, like “Do you have the procedures you need?” you get less than if you ask, “Does this particular task that you do have a procedure? Is it current? Is it accurate? Can you find it?” The idea is to try to get as detailed as possible to collect the most valuable data.
Russel: To me, that’s okay. That one’s pretty straightforward. I can see how you could do that. That’s pretty straightforward. I could see how you could dashboard the actual results.
If I’m looking at a particular task and that task is done by 100 people in the organization, 80 percent of them respond, and my answers are this. I can put some numbers around that and do some cool little dashboard techniques, but that, in and of itself, doesn’t necessarily get to a, “How am I improving my safety posture?” kind of conversation.
Sheila: Having all of that information, though, does give you a leading indicator of a potential problem. First, the 80 percent response rate is itself a form of analysis. We can analyze that in and of itself.
So, you have your culture as a whole. If you have 100 people, why aren’t the 20 responding? Are they on vacation? Do they not care? All that factors into your safety posture and your safety culture.
If you have everybody responding, you have a high percentage. Then, of those responding, if you have a low percentage of failure, maybe there’s just a one-off. There are a couple of people who need to be trained.
If you have a high number, maybe you have a systemic problem, in which case you could look at that as a factor that needs to be addressed.
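A minimal sketch of the response-rate and failure-rate reasoning described here; the threshold used to call a result “systemic” is an illustrative assumption.

```python
def classify_task(assigned: int, responded: int, reported_problem: int,
                  systemic_threshold: float = 0.25) -> dict:
    """Turn raw survey counts into the two leading indicators discussed here:
    how many people are answering, and how many answers point to a problem."""
    response_rate = responded / assigned
    failure_rate = reported_problem / responded if responded else 0.0
    if failure_rate >= systemic_threshold:
        finding = "possible systemic problem"
    elif reported_problem:
        finding = "isolated; retrain a few individuals"
    else:
        finding = "no issue reported"
    return {"response_rate": response_rate, "failure_rate": failure_rate, "finding": finding}

# 100 people assigned the task, 80 responded, 30 reported a missing or outdated procedure.
print(classify_task(assigned=100, responded=80, reported_problem=30))
# {'response_rate': 0.8, 'failure_rate': 0.375, 'finding': 'possible systemic problem'}
```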
Russel: Right. What I’m getting at is you can certainly draw some conclusions or get some clarity by asking why questions, like, “Why am I seeing this number when I think I should be seeing this other number?” That kind of thing. Through that, you discover some things that systemically need to be resolved.
Sheila: On the big picture, you can’t fix what you don’t measure. Step one is you have to have this data to even have this discussion.
Russel: [laughs] The conversation makes me laugh because I had a significant emotional event early in my career with a full bird colonel in the Air Force who was explaining to me management by objective and the fact that you can’t manage what you don’t measure. That got burned deep into my brain, Sheila. You’re touching a chord there.
Sheila: [laughs] It’s a very true thing. You can only guess. It goes back to those same people with the biases that they’re doing the right thing in the workplace based on their history and their memory.
I’ve worked with a lot of people who have 30, 40 years of experience in a particular technical trade or field. Those guys are the go-to guys. You talk to them. They know what they’re doing. Again, it’s all gut feel until you actually show them a procedure. You can assume that they’re doing the right thing, but you don’t know. You need the data.
Russel: Great engineering always has great documentation. You can’t do great engineering with poor documentation. You just can’t. This is the same thing in safety. You can’t have a great safety program without great documentation. In the safety world, that’s your training, your policies, your procedures, all that.
Sheila: That all leads to your highly reliable organization, another buzzword that’s focused on in the book.
Russel: Interesting. When you think about what industry does this really well, you have to think about airlines. They do this really well. Airlines are a very safe way of getting from point A to point B, and a very efficient way, too.
Sheila: They’ve had an SMS program for many, many, many years before us.
Russel: Yeah, decades.
Sheila: API 1173 mimics what they’ve already been doing for a very long time to establish where they’re at today.
Russel: When I first got out of the military, this would have been in ’85, I worked with a guy who at that time was probably right around 60 years old and had a job where he traveled all his life, and he had lived through three plane crashes.
Now, you have to put this in context. Back in the ’60s and ’70s, plane crashes were not unheard of. They didn’t happen often, but they weren’t unheard of. Now, they’re pretty much unheard of. They just pretty much don’t happen.
Sheila: Yes. Correct.
Russel: It’s because of all those kinds of things.
Sheila: It’s the same thing, collecting all that data ahead of time. They have a lot of leading indicators to prevent problems. We work with a gentleman from American Airlines who talks a lot about this. The people, the different airline pilots, maintenance, anybody, can put in feedback to recognize when something is not right.
He always gives an example of someplace in Memphis that had two different spellings of “Blues” for a landing point, and nobody ever noticed it, but you can see how that could cause a problem if somebody thinks they’re going to Blues with one spelling and then goes to the wrong Blues in a different place.
It’s definitely about getting those leading indicators: people providing feedback and information, asking the questions, having the open dialogue, and having a tool to be able to present all this information in an effective manner. If everything was pen and paper, you’d be bogged down and never get anywhere.
Russel: Right, because you’ve got to submit it. Somebody’s got to analyze it to determine if action is needed, and the action has got to happen. Then you’ve got to round trip the communication back to the person that put in the recommendation and all of those impacted.
Sheila: Even following up past that. Part of the Plan Do Check Act is continuing past that. Even after the implementation, did we implement the right thing? Check what you just did. Make sure that it’s delivering the benefit you expected.
Russel: I wanted to transition a little bit and talk about… Once I start sending out these inquiries, and I’m getting this data back and the data is structured, and I’m able to see things like response rates and level of training and the validity of procedures on a specific task basis, I can start pulling that data together into a dashboard.
What would it take to actually correlate that into a safety program? Maybe, a better way to ask that is: can I put a probability of failure on those kinds of things?
Sheila: I would put it as a factor, similar to what we do for a lot of our DIMP leak data. I know we’re talking lagging, but just from a factor perspective, we take a five-year history of leak data, accidents, injuries, and fatalities, take that factor, and run it across the whole risk algorithm in order to normalize what and how it would impact.
You could do the same thing with the probability piece of the risk model because you’re still going to have the probability of failure based on soil, based on the type of material, you’re still going to have asset pieces. You could put a holistic factor against the probability side of the equation and help determine how that impacts your overall risk.
Russel: That, to me, is a pretty compelling idea. If I just think about a typical integrity management cycle, I’ve got my tool run. I’ve got my data analysis. I’ve got my features requiring excavation. I’ve got my dig plan, all these various components of a typical kind of integrity management process.
I could look at each one of those pieces of the organization, and I could develop an organizational probability of failure based on these leading indicators and then apply that to the overall probability for the asset.
Sheila: Similar to how we do weight factors now, where some materials get a higher weight than others, or maybe proximity closer to central business districts, things like that. You can put any kind of weight or factor on the probability side.
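To make the arithmetic concrete, here is a minimal sketch of applying conventional attribute weights plus an organizational factor on the probability side of a risk calculation. The weights, the multiplicative form, and the numbers are illustrative assumptions, not a published risk algorithm.

```python
def probability_of_failure(base_pof: float, attribute_weights: dict[str, float],
                           organizational_factor: float) -> float:
    """Scale a base probability of failure by asset-attribute weights
    (material, soil, proximity, ...) and by an organizational factor
    derived from leading-indicator survey data."""
    pof = base_pof
    for weight in attribute_weights.values():
        pof *= weight
    return min(pof * organizational_factor, 1.0)   # keep the result a probability

weights = {"material": 1.2, "soil_corrosivity": 1.1, "central_business_district": 1.3}
adjusted = probability_of_failure(base_pof=0.02, attribute_weights=weights,
                                  organizational_factor=1.25)
print(f"adjusted probability of failure: {adjusted:.4f}")  # 0.0429
```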
Russel: If you did that, it would create a mechanism that would justify spend on things like training and procedures and tools that might not otherwise get justified. So, why don’t we just do that?
Sheila: [laughs] Sounds so simple, doesn’t it? Most people are looking at it, but most people don’t have the tools. There are lots of reasons why it hasn’t been implemented. I can’t say holistically that it hasn’t been anywhere, but I have not seen it.
Russel: People get stuck with, “We’re really good in our business of applying a model that we know works. We’re not that good at coming up with a model that’s not yet been developed.”
Sheila: That’s a great point. That’s another piece in the book that we talked about at the beginning of the episode: being able to come up with a model that you can test is beneficial. You need the results to be able to prove out your model for effectiveness. It’s very difficult to push failure as a test scenario, especially when it comes to these organizational factors.
Russel: Yes, it is. Yes, it is. What do you think that all pipeliners should take away from this conversation as it relates to safety management, risk management?
Sheila: Get feedback from the people. Nobody knows what’s going on in the field and nobody knows what people are doing other than those people that are out there.
Anybody in the office, anybody in the field, everybody has a job to do. It’s really beneficial to get their feedback and hear what things are happening. Are things happening at the right time and in the right place? It’s the whole set of quality management principles, just making sure that things are being done the way they’re supposed to be done.
Russel: Again, sounds easy if you say it fast.
Sheila: We seem to end every podcast with that statement. [laughs]
Russel: It’s the nature of what you and I talked about, actually.
Sheila: Yeah, yeah.
Russel: The other thing I would say is try to add some leading indicators into what you’re doing.
Sheila: Agreed.
Russel: Or maybe another way to say that, more specifically, is try to add some human indicators. What’s the human contribution, or the organizational health contribution, to the overall safety performance and probability of failure?
Sheila: Absolutely. Especially since that seems to be the key takeaway from many of these large incidents.
Russel: That’s absolutely right. As always, you did a great job. Thank you so much for coming on.
Sheila: Thank you.
Russel: I enjoyed the conversation, and I learned some stuff. I’ve got a book I need to go read.
Sheila: It’s not a slow read. It’s pretty good. I like it. It’s very informative, thought-provoking. It’s a good read.
Russel: Just for the record, while we’re on the podcast, I went and did an Amazon search, and I found the book, and it’s available. Help yourself, get a copy.
Sheila: We’ll put it in the show notes.
Russel: We’ll do that. We’ll do that. Thanks, Sheila.
Sheila: Thank you, Russel. I appreciate it.
Russel: I hope you enjoyed this week’s episode of the Pipeliners Podcast and our conversation with Sheila. Just a reminder before you go, you should register to win our customized Pipeliners Podcast YETI tumbler. Simply visit pipelinepodcastnetwork.com/win and enter yourself in the drawing.
If you’d like to support the podcast, the best way to do that is to leave us a review. You can do that on Apple Podcast, Google Play, Stitcher. You can find instructions at PipelinePodcastNetwork.com.
[background music]
Russel: If you have ideas, questions, or topics you’d be interested in, please let me know either on the Contact Us page or reach out to me on LinkedIn. Thanks for listening. I’ll talk to you next week.
[music]
Transcription by CastingWords