Pipeliners Podcast host Russel Treat welcomes back Adam Hill of Kepware/PTC for part two of a conversation on how to simplify EFM data collection using the latest technology.
In this episode, you will learn about the history of data collection in SCADA systems, the difference between the primary data polling protocols, the importance of edge communications for building the architecture that connects field devices to the host server, how to integrate IT and OT using today’s technology, and much more.
EFM Data Collection: Show Notes, Links, and Insider Terms
- Adam Hill is a Strategic Account Manager for Kepware Technologies. Find and connect with Adam on LinkedIn.
- Kepware is a software development business of PTC, Inc. Kepware provides a portfolio of software solutions to help businesses connect diverse automation devices and software applications and enable the Industrial Internet of Things.
- PRESENTATION: Download Adam’s ISHM 2019 presentation, “Simplifying Real-time and EFM Data Collection.”
- PRESENTATION: Download Adam’s ISHM 2019 presentation, “OPC Overview.”
- IIoT (Industrial Internet of Things) is the use of connected devices for industrial purposes, such as communication between network devices in the field and a pipeline system.
- PLCs (Programmable Logic Controllers) are programmable devices placed in the field that take action when certain conditions are met in a pipeline program.
- EFM (Electronic Flow Meter) measures the quantity of gas or liquid flowing in a pipeline and performs flow calculations that are communicated back to the host system.
- RTUs (Remote Telemetry Units) are electronic devices placed in the field. RTUs enable remote automation by communicating data back to the facility and taking specific action after receiving input from the facility.
- A type of RTU is an EFM Flow Computer that measures the flow of gas or fluid and reports the data back to the facility. It differs from an RTU in that it is designed to compute flow using standard flow equations with specific timing and reporting requirements.
- SCADA (Supervisory Control and Data Acquisition) is a system of software and technology that allows pipeliners to control processes locally or at remote locations. SCADA breaks down into two key functions: supervisory control and data acquisition. Included is managing the field, communication, and control room technology components that send and receive valuable data, allowing users to respond to the data.
- OPC (Open Platform Communications) is a data transfer standard for communicating device-level data between two locations, often between the field and the SCADA/HMI system. OPC allows many different programs to communicate with industrial hardware devices such as PLCs. The original standard was dependent on Microsoft Windows before shifting to an open platform.
- OPC DA (or OPC Classic) is a group of client-server standards that provides specifications for communicating real-time data from devices such as PLCs to display or interface systems such as HMIs and SCADA.
- Edge Communications is a method of building out the architecture for structured communication from edge devices in the field to a host server using connectivity to poll and transmit the data.
- MQTT (Message Queuing Telemetry Transport) is a publish-subscribe protocol that allows data to move quickly and securely through the system and does not bog down the system with unnecessary requests.
- IT/OT convergence is the integration of IT (Information Technology) systems with OT (Operational Technology) systems used to monitor events, processes, and devices and make adjustments in industrial operations.
EFM Data Collection: Full Episode Transcript
Russel Treat: Welcome to the Pipeliners Podcast, episode 76, sponsored by Gas Certification Institute, providing training and standard operating procedures for custody transfer measurement professionals. Find out more about GCI at gascertification.com.
[music]
Announcer: The Pipeliners Podcast, where professionals, Bubba geeks, and industry insiders share their knowledge and experience about technology, projects, and pipeline operations.
Now your host, Russel Treat.
Russel: Thanks for listening to the Pipeliners Podcast. I appreciate you taking the time. To show that appreciation, we are giving away a customized YETI tumbler to one listener each episode.
This week, our winner is Will Gage with Targa Resources. Congratulations, Will, your YETI is on its way. To learn how you can win this signature prize pack, stick around until the end of the episode.
This week on the Pipeliners Podcast, Adam Hill with Kepware Technologies returns to talk about simplifying data collection. Adam, welcome back to the Pipeliners Podcast.
Adam Hill: Great to be here, Russel. Thank you for having me back.
Russel: We talked last week about the fundamentals of OPC. This week, we’re going to talk about simplifying data collection. I think this is a big subject area, personally. I think the way that we’ve been collecting data to get it up from the field and to the SCADA system and out to the measurement accounting systems, I don’t think that’s changed much in 10 or 15 years.
Maybe the way to dive into this is just talk about what’s the nature of the market as it relates to data collection, what’s going on, what’s happening.
Adam: Definitely. I’m giving another talk next week on this very topic. We work with customers all the time that have very unique architectures and challenges and requirements to collect both real-time and EFM data from devices, PLCs, flow computers, and whatnot.
The building blocks of some of this, as we talked about in the previous podcast, still come into play with increasing complexity, where you’re collecting data from a multitude of devices in the field, needing to serve it up to applications like SCADA, even crossing over into IoT requirements and needing to get data into big data analytics packages.
There’s a lot of increasing complexity that still resides, and always will, but sort of wrapping around that, and putting things in place like a data collector, for example, to streamline the communication bridge between your devices and your client applications, could certainly make a lot of sense.
One of the things I touch on during this particular topic is what today’s SCADA looks like, touching on the IoT architecture and what companies are doing to get data out of SCADA, for example with custom applications and whatnot, then segueing into different avenues as we push things closer and closer to the edge, if you will.
Russel: Definition of the edge.
Adam: [laughs] That’s a good question.
The definition of the edge, I would ask you the same question, because I certainly understand your expertise. We are the software vendor. From our perspective, it’s as close to the device…
Russel: Definition of device.
Adam: PLC, RTU, EFM — in the field.
Russel: I actually have a little different definition of the edge. This is interesting.
Adam: Yeah, this is good.
Russel: My definition of the edge is right by the instrument.
Adam: I like it.
Russel: However, right now, the edge is right behind the field device, the PLC, the RTU, the EFM. Because you don’t really have edge devices in our world — in oil and gas — that are actually talking to the instruments, but that’s coming.
The edge is basically where the data’s at, where the data’s coming from. You could ultimately think about the edge as being the valve pressure, that sort of thing. That’s my definition.
Adam: I like that. I like it. I’m learning something new every day.
[laughter]
Russel: That’s good. I like to learn. Learning’s excellent.
One of the things I think about, what makes data collection complicated? I’m going to just do a little history. 20 plus years ago, SCADA and automation was a big, monolithic thing. You had devices out in the field. You had telecommunications. You had big iron systems and all kinds of custom code that would translate from instruments to numbers on a screen on a computer system. That all existed as one big monolithic block.
Like a lot of other technologies, what’s been happening is that monolithic block has been getting decomposed. The instruments and the RTU are clearly segregated. The RTU or the PLC’s got intelligence in it now.
Communications has become a commodity. It doesn’t matter if I’m moving email or data or graphics or what. The communications is just communications. It’s all becoming IP rather than I’ve got some serial and I’ve got some IP and I’ve got some other proprietary stuff over satellite. It’s all normalizing.
I think one of the things that’s going on is — because communication’s become a commodity and the tools and ability to move the data have become standardized — now there’s a whole lot more people that want the data. That’s one of the things that’s acting, I would assert.
I think the other thing that’s acting is some of these people that want the data want it for new purposes. I’m thinking, in particular, data analytics. Everybody’s hearing about data analytics. Basically, that’s getting a time series of data and applying math and statistics to it to learn something meaningful to support decision-making.
I think what’s happening is we’re actually seeing complexity and requirements evolve pretty rapidly. I would guess you’re a vendor and you guys work in a lot more than just oil and gas. What are y’all seeing in that domain?
Adam: Oil and gas is definitely our most challenging, unique vertical with regard to connectivity and networks and whatnot. We have customers in food and beverage and automotive and whatnot that have brand new plants with perfect Ethernet communications throughout the plant. Not every vertical is structured that way.
It’s exciting to work with customers in oil and gas because of their unique requirements and the need to get data from sometimes very old equipment that’s been acquired through divestitures or acquisitions or what have you.
Certainly, reducing the complexity we talked about that resides in the data collection piece, and putting a data collector in between your client layer, your SCADA, and your PLCs, makes a lot of sense. Moving into EFM data collection as well, we’re talking about simplifying two different types of data collection.
The unique challenge of needing to extract EFM data from flow computers in the field, for obvious reasons due to custody transfer purposes, is something we’re seeing more and more of: customers needing to expose that level of data to different parts of the organization. That’s something that’s unique.
Russel: I’ve been doing this a long time. When I first started to get into this business, it was not uncommon to go to a field site and you’d see one communications drop for SCADA, another communications drop for measurement, and sometimes even another communications drop if you’re at a compressor station or something like that that was supporting machine diagnostics.
Nowadays, you go to a station, there’s one communications drop. It’s all the same data. There’s no reason for three different sets of communications infrastructure to get it. That’s certainly one thing that you’ve seen change.
I think the other thing that you’re seeing change is that there’s a lot of points of data concentration. Before, again 20 years ago when I got in the business, data concentration was done in PLCs or similar equipment. Nowadays, data concentration’s almost always done in software.
Anyways, what are the challenges of EFM collection versus real-time data collection?
Adam: That’s a really good question. I would certainly like to hear your answer to this question as well, back to the learning piece, from software vendor to measurement expert and whatnot, because that’s the biggest thing for us as a software vendor.
We really need to take a step back a lot of times and understand that we’re not working with a customer that has a line in a plant with perfect communications and a wired type of architecture. We’re dealing with very complicated networks here to pull real-time and EFM data from various devices. I think the protocols are challenging to support.
One of the things that we’ve found is that some of the older protocols for EFM data collection are hard to work with. We’ve struggled with being able to adapt and adopt support for those. I would say that’s one of the big ones with EFM data collection: making sure we support all types of flow computers, even the really old stuff.
Just keeping up with where the market’s headed and some of the unique challenges that people are facing with respect to data collection. One of the things we do well is modifying storage requirements for this data, being able to port data into different measurement applications, CSV files, database applications — wherever it needs to go — making sure we do that correctly and keeping up with standards.
Again, from the software vendor perspective, it’s very important to keep a pulse on the industry with regard to measurement data. That applies to real-time data, yes, but especially the measurement piece: keeping up with requirements from these customers, because we’re talking about a different type of data being collected from a different type of device.
Russel: That’s really what I was going after, and you alluded to it. For the listeners, Adam and I know each other pretty well. Adam knows my particular technical expertise. I’m pretty deep in this whole measurement and custody transfer conversation. This is a little bit of a softball for me, if you will. Real-time data is pretty simple. It’s small data sets. It’s a list of values with a timestamp. It’s pressure one, pressure two, temperature one, temperature two, with a timestamp. It’s simple.
When you start talking about measurement data, I’m now really talking more about something you would typically do in a database. All of these field devices — the EFMs, the PLCs, and all that — if you go back to how they were originally built, they were all originally built as simple, time-series databases. They would give you a list of registers and “here’s the value right now.”
When you start trying to historize all that and bring that into a data set and I’m going to accumulate it for this hour — here’s my total volume, here’s my total energy, here’s my average temperature and my average pressure, my configuration was this, etc. — all of a sudden this is no longer a simple data set. It’s a complex data set, and I have to move the entire block of data.
A partial block is not useful or meaningful. That adds a lot more complexity. These EFM protocols need to be more robust. They need to be more reliable. Oftentimes, they need a more reliable communications network as well to communicate. That’d be my take on it.
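To make that contrast concrete, here is a minimal sketch, in Python, of the difference between a real-time sample and an hourly EFM history record. The field names are illustrative only, not any particular vendor’s protocol layout.

```python
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class RealTimeSample:
    """A simple real-time poll: a handful of current values plus a timestamp."""
    timestamp: datetime
    static_pressure_psi: float
    diff_pressure_inh2o: float
    temperature_degf: float

@dataclass
class HourlyEfmRecord:
    """An EFM history record: a block that is only meaningful as a complete unit."""
    period_start: datetime
    period_end: datetime
    total_volume_mcf: float
    total_energy_mmbtu: float
    avg_static_pressure_psi: float
    avg_diff_pressure_inh2o: float
    avg_temperature_degf: float
    config_snapshot: dict = field(default_factory=dict)  # orifice size, contract hour, gas composition, etc.
```

A partial RealTimeSample is still a usable value; a partial HourlyEfmRecord is not, which is why the protocols that move these blocks have to be more robust.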
Right now, the method of doing the EFM collection would be I put in some kind of communications network. Then, I install a server someplace. That server’s reaching out to all the devices and bringing the data back. The server tends to be centralized. What do you think the future holds for this? Where are we headed?
Adam: Great question. This is really the nexus of where we’re at today regarding what customers are doing. A lot of these are POCs where they’re just trying new things. Different ideas, different architectures are being implemented.
Really, the current architecture for collecting real-time data and EFM data for SCADA and measurement applications would be to draw a dotted line between your PLCs, your EFMs, and your RTUs that are in the field. Then, above that is your enterprise, and the data collector resides within the SCADA. The polling engine resides within the SCADA application, or right beside it, or what have you.
Really where things are headed, obviously with today’s SCADA and IoT architecture there’s a lot of custom development and interoperability issues to bridge the IT/OT gap to get data into big data analytics packages and out of SCADA. Obviously, measurement applications are involved there as well. There’s various challenges to doing that.
You’re very familiar with this concept, Russel, and have talked with us about it many times, but this tightly integrated, loosely coupled application idea.
Russel: Will Gage and I recorded a podcast on that exact topic a ways back.
Adam: Love it. Love it. Think of these behemoth SCADA applications where the polling engine would reside within the SCADA. As you know, the idea is decoupling that piece from the SCADA and then, in certain customer architectures, putting it south of SCADA to do the data collection piece. Then, you move data where it needs to go from the central data collector.
The other thing we’re seeing, which is rather unique: a company in Oklahoma City implemented a way to use our data collector north of SCADA. They put a data collector, with its own polling engine, north of SCADA to poll data from eight SCADA servers using OPC DA. Each SCADA application had an OPC DA server associated with it.
They were able to use a DA client driver to pull data from each SCADA instance, aggregate the data, and then, using MQTT, actually publish the data to a broker, and from there to other applications within the organization, other IoT applications.
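As a rough sketch of that north-of-SCADA pattern, the Python below polls values from several SCADA instances and publishes the aggregated result to an MQTT broker using the paho-mqtt client (1.x-style constructor shown). The `read_opc_values` helper is a hypothetical stand-in for whatever OPC DA client library is actually used, and the host names, topics, and poll interval are made up for illustration.

```python
import json
import time

import paho.mqtt.client as mqtt

SCADA_SERVERS = ["scada01", "scada02", "scada03"]  # hypothetical SCADA host names

def read_opc_values(server: str) -> dict:
    """Hypothetical stand-in for an OPC DA client read against one SCADA's DA server."""
    # Replace with calls to your actual DA client driver.
    return {"static_pressure_psi": 512.3, "temperature_degf": 74.1}

client = mqtt.Client()                         # paho-mqtt 1.x-style constructor
client.connect("broker.example.local", 1883)   # hypothetical broker address
client.loop_start()

while True:
    for server in SCADA_SERVERS:
        values = read_opc_values(server)       # poll each SCADA instance
        payload = json.dumps({"source": server, "values": values})
        client.publish(f"scada/{server}/realtime", payload)  # aggregate under one topic namespace
    time.sleep(5)                              # illustrative poll interval
```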
Russel: Let’s talk about what is MQTT? I’ll bet most of the people that are listening to this have never heard that before.
Adam: [laughs] I’m certainly not an expert on MQTT. I will put that right out there. I’m more familiar with other protocols, certainly more familiar with OPC.
What MQTT is, from just the bare bones of my understanding, is an IT-centric protocol for data exchange. It allows you to move data in different ways. You can think of it as a different pipe out of your data collector.
Russel: I’ll give you my take. Again, I wouldn’t put myself out as an expert in MQTT. It is something I’ve looked at pretty hard. I think there’s a couple things you have to talk about.
The first thing is: what is publish/subscribe versus poll/response? Most communications in SCADA and automation is poll/response. The host computer asks for the data and the remote responds. I don’t get any data unless I ask for it. That’s poll/response.
If you think about it, your email works in a pub/sub way. When I push send, I’m publishing my email. Then when somebody else gets on their client, they get it. All the people that the data’s addressed to get the data. They get it when they ask for it. They could be permanently subscribed and get it as a real-time feed. It’s a fundamentally different way of moving data.
MQTT is a pub/sub. Pub/sub has been around for a long time in the business world. There’s whole companies that are built around that. That pub/sub is more of a TCP/IP business network kind of communications protocol. MQTT is a low-profile, low-overhead, secure, real-time pub/sub protocol.
I know I just said a whole boatload of buzzwords there. Bottom line is this. Anybody who needs the data can go to the MQTT broker or server and say, “Here’s the data I need.” Then whenever that data’s available, they get it.
It’s for real-time data, not a data lake where all the data I ever got is in the data lake. It’s just the real-time data feed. It’s a different way of thinking about this stuff.
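To put the poll/response versus publish/subscribe contrast into code, here is a minimal subscriber sketch, again assuming the paho-mqtt client (1.x-style callbacks) and a hypothetical broker and topic. The consumer subscribes once and is handed data whenever something is published, instead of asking for it on a schedule.

```python
import paho.mqtt.client as mqtt

def on_connect(client, userdata, flags, rc):
    # Subscribe once; the broker pushes matching messages from then on.
    client.subscribe("scada/+/realtime")

def on_message(client, userdata, msg):
    # Called whenever any publisher sends data on a matching topic.
    print(f"{msg.topic}: {msg.payload.decode()}")

client = mqtt.Client()                         # paho-mqtt 1.x-style constructor
client.on_connect = on_connect
client.on_message = on_message
client.connect("broker.example.local", 1883)   # hypothetical broker address
client.loop_forever()                          # no polling loop; data arrives as it is published
```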
Adam: It is definitely. We’re seeing two specific POCs.
Russel: POC being proof of concept.
Adam: Exactly.
Russel: You got to be careful about those TLAs [three-letter acronyms].
Adam: [laughs] I know. It’s exciting.
They’re not full-blown adoptions yet. These companies are trying new and unique ways to change how they’re collecting and moving the data around. You touched on those concepts of poll/response versus publish and subscribe, changing the methodology a little bit of how data is collected and served up to various applications. Again, you touched on it, we’re talking about real-time data, not EFM data.
I’m working with a few companies, I know very little about this yet, new companies that are looking to expose even the measurement data to different parts of the enterprise. There’s a lot coming there as well.
Russel: I agree. I think this is transformational for our business. Right now, there’s a lot of people that they get data in the morning when they come in. They don’t see data again until they come in the next morning. The reality is very much that that’s going to change. We’re all going to be able to see the data in virtually real-time. There’s going to be lots of people consuming it.
One of the things that’s interesting when you start looking at analytics is I need the data contextualized for the particular kind of analytic I’m doing. If I’m doing a machine analytic, that’s a different contextualization than I’m doing a capacity analytic than I’m doing a profitability analytic.
They all use the same raw data but contextualized a different way. That’s where something like MQTT can really add a lot of value.
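As a small illustration of contextualization, the sketch below takes one hypothetical raw reading and shapes it two different ways, once for a machine analytic and once for a capacity analytic; the tag names and numbers are invented for the example.

```python
# One raw reading from the field (hypothetical site, tags, and values).
raw = {
    "site": "station_12",
    "suction_pressure_psi": 412.0,
    "discharge_pressure_psi": 918.0,
    "vibration_in_per_s": 0.12,
    "flow_rate_mcf_per_h": 240.0,
}

# The same values, contextualized for a machine (compressor health) analytic.
machine_context = {
    "asset": f"{raw['site']}/compressor_1",
    "vibration_in_per_s": raw["vibration_in_per_s"],
    "compression_ratio": raw["discharge_pressure_psi"] / raw["suction_pressure_psi"],
}

# The same values, contextualized for a capacity analytic.
capacity_context = {
    "segment": raw["site"],
    "throughput_mcf_per_h": raw["flow_rate_mcf_per_h"],
    "pct_of_design": raw["flow_rate_mcf_per_h"] / 300.0,  # 300 is an assumed design capacity
}
```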
The purpose of this is to simplify. One of the things I need out of this conversation is to get some clarity about what it means to simplify. What do I need to do to simplify all this stuff that we’ve been talking about making complicated?
Adam: Correct me if I’m wrong. I think you’re alluding to pushing the data collection piece closer to, if not all the way to, the edge.
Russel: Maybe. Certainly, that might be one part of it.
I think I would talk about it, Adam, a little bit more conceptually in that the part that most people will agree is the most complex is when I’m dealing with the native protocol and I’m getting it right. That’s the part that’s the hardest.
If I’m not getting the data to measurement, if there’s some issue and I’m missing something or something strange is happening with the data and I don’t understand why, that almost always gets to what protocol am I using, how am I using it. It gets all the way down to that level of detail frequently. You got to understand how that stuff actually works.
When you start getting the data normalized, now I can just move the data around because it’s normalized. I think part of it is data normalization.
Then I think the other part of it is, simplify the communications. That would be my answer. When you start talking about, “How do I simplify the communications?” I think that’s where you start saying, “Look, do all the data collection and normalization at the edge. Then, after that, use some standard to move it around, like MQTT.”
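Here is a minimal sketch of that normalization step at the edge: raw register values from one hypothetical device are mapped to standard tag names and engineering units before anything is published over a standard like MQTT. The register numbers and scaling are made up, not any real device’s map.

```python
# Hypothetical raw poll from one device: register numbers with vendor-specific scaling.
raw_registers = {40001: 3521, 40002: 684}   # e.g. pressure in 0.1 psi, temperature in 0.1 degF

# Per-device map from register to a standard tag name and a unit conversion,
# applied at the edge so everything upstream sees one normalized shape.
TAG_MAP = {
    40001: ("static_pressure_psi", lambda v: v / 10.0),
    40002: ("temperature_degf", lambda v: v / 10.0),
}

def normalize(registers: dict) -> dict:
    """Return a vendor-neutral dict of tag name -> value in engineering units."""
    out = {}
    for reg, value in registers.items():
        name, convert = TAG_MAP[reg]
        out[name] = convert(value)
    return out

normalized = normalize(raw_registers)
# {'static_pressure_psi': 352.1, 'temperature_degf': 68.4}
# This normalized payload is what would get published over a standard like MQTT.
```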
You don’t have anything to add to all that? You’re going to let me pontificate?
Adam: [laughs] The edge discussion and a lot of the things you just mentioned, I’m even in a place at Kepware — and a lot of us are — where, especially with regard to this industry, we’re learning about the way things are evolving within the industry and the requirements of certain customers. The whole edge discussion, especially for me, is something that I’m learning about each day.
It’s very different for us too because we’re supporting operating systems that we may not have necessarily supported before. Even for our development teams and QA teams, there’s a lot happening that’s allowing us to move in this direction. Things are definitely changing daily for sure.
Russel: I’m with you. A lot of things, when you start talking about the edge and MQTT and such as that, there are some people that are doing some stuff with it. Most of what’s being done is experimental or proof of concept. They’re trying to understand it and determine what value they can drive out.
There’s not a lot of real projects. I do know of some. The ones that I know of tend to be more machinery centric, compressor optimization, that sort of thing. They’re not really measurement centric, although I think that’s coming. To me, it’s a really interesting conversation.
I think the one thing I would tell listeners to take away when you get into this conversation, I actually got asked a question one time, “How much data do I need to get in order to be regulatory compliant? How often? Do I need data every second, every minute, every hour?” That’s a really easy question with a really complex answer. I actually did a whole training thing. There’s a YouTube video out on that.
Not unlike that, this conversation about simplifying data collection: if I put all the same stuff in the field, it gets easier. If I always use an ultrasonic meter, I always use one vendor’s EFM device, and I always use a cell radio, it gets easier.
Problem is, there’s all kinds of reasons [laughs] why that doesn’t work, acquisitions and divestitures and economics and terrain and all that. To the extent you can standardize, it gets simpler.
Boy, I’ve had the conversation many, many times in this domain about the business case, that being the specific combination of what you are doing, what equipment you are using to do it, what communications — everything from what’s going on at the instruments all the way back to how the data’s being used. Every time you change a business case, it gets harder.
I’m going to do the same thing I did to you last time. I want to ask you, what are your three key takeaways from this conversation?
Adam: Great question. We talked about a lot of stuff today.
I think the big thing from an overview perspective is to understand that there’s a big difference between real-time and EFM data. That’s definitely a key takeaway in some of the elements we discussed today.
Also, understanding today’s SCADA architecture — the way in which companies are using custom-developed tools to maybe get data out of SCADA and into other applications. But also understanding the options to introduce some of these tightly integrated, loosely coupled applications for getting data, using a data collector in a different way. That’s definitely another key takeaway.
The final takeaway is that we are moving to the edge. Certainly, there are applications out there and tools and edge gateway devices that folks are using. It’s an exciting time. It’s going to be exciting within the next five years or so or even less to see where things are headed.
Russel: I think the other thing, and I think we alluded to this a little bit in the conversation, is that as things move to the edge, the distinction between OT and IT’s going to blur, and the distinction between automation and IT’s going to clarify.
Adam: Yep, well said.
Russel: As always, Adam, it’s a pleasure. Thanks for joining me. For those of you that are listening to this, Adam has already done his presentation at ISHM. We’re going to link up some of that content on the show notes, including other information about Adam and such. We’ll build a show notes page so you can find him and you can find the materials that underlie this conversation. I’ll tell you, the pictures on the PowerPoint are very helpful.
Adam: Definitely.
Russel: Thank you, Adam. It was a pleasure.
Adam: Thanks, Russel. Take care.
Russel: I hope you enjoyed this week’s episode of the Pipeliners Podcast and our conversation with Adam Hill. Just a reminder, before you go, you should register to win our customized Pipeliners Podcast YETI tumbler. Simply visit pipelinepodcastnetwork.com/win to enter yourself in the drawing.
If you would like to support this podcast, please leave a review on Apple Podcasts or whatever app you use to listen to podcasts. You can find instructions at pipelinepodcastnetwork.com.
[music]
Russel: If you have ideas, questions, or topics you’d be interested in hearing, please let me know on the Contact Us page at pipelinepodcastnetwork.com or reach out to me on LinkedIn.
Thanks for listening. I’ll talk to you next week.
Transcription by CastingWords