Standards-Based Manufacturing Digital Thread and Twins.

Apr 01, 2021

When it comes to digital twins and digital threads, there is no widespread agreement on what they are or how they can be expressed in terms of data artifacts. This presentation proposes a standards-based definition and links it to standards-based data artifacts that go with that definition.



Transcript:

Jan de Nijs: 

Okay. Well, good morning. Bright and early for us here. Today I would like to talk about the manufacturing digital twin. My name is Jan de Nijs. I'm a tech fellow at Lockheed Martin. My specific field of expertise is the enterprise digital production topic; I spend a lot of time on that. And I hope that I can shed some light on digital twins and how standards can actually enable them. So, let me try to go to the next one. Okay, here we go. So, I want to talk today about digital twins, a little bit about applications and other things that have to do with them, give you some use cases, and come up with some conclusions. We can start, of course, by talking about what Lockheed Martin does. We're a defense contractor, of course; many people know that. We're all over the world, 54 countries, but mostly in the US, mostly focused on the US, with lots of scientists and lots of engineers. We're working on all kinds of things, mostly in four different work areas. We have fixed wing, we have missiles, rotary wing, and other systems, and then things that actually leave the atmosphere, which we call our space division. 

Jan de Nijs: 

So, we basically have four pillars, if you will, that make up our product lineup. And I'm from the aeronautics department, the fixed wing side. I'm personally very much involved with F-35 production. So, digital twins. There have been a lot of discussions about what a digital twin actually is, and what I'm finding is that basically everybody has their own definition of it. The engineering world believes that their nominal models, the things they produce in their CAD systems, are the digital twin. Other people believe that there is such a thing as The digital twin, the twin that rules them all, and other people have completely different views on that. So, in the last couple of years at least, thank goodness, people have started to come up with definitions of what digital twin might mean. As far as I'm concerned, the first active attempt at defining a digital twin was by Grieves and Vickers a couple of years ago. I don't know whether anybody did it before, but I can't find it. They start to link everything to something physical. They start to focus on the twin, things that come in twos. 

Jan de Nijs: 

There has to be something physical that you're describing with something digital, if you will, and that basically got further developed into what's now ISO 23247. ISO 23247 is a framework, but it also tries to further develop that early definition, and everybody can read what that is, but I've highlighted the key parts. It's focused on manufacturing. It describes, for a purpose, observable manufacturing elements at an appropriate rate of synchronization. I think in our community, at least, the realization has come that digital twins really have to represent an observable element. So, in our world, even though there's absolutely no agreement, and I get that, it seems that these digital descriptions need to match something that we can observe. Now, to get deeper into that, because "something that you can observe" is a pretty high-in-the-sky concept, what is an observable element? Well, if you really look at it, the universe is an observable element. It's probably one of the most observed elements; we've been doing that for 400 years. But there are many more. 

Jan de Nijs: 

And there are many more observable elements: Earth, Mars, humans. Humans are observable elements. But then, of course, when we start to go down into the manufacturing world, we can come up with an enormous number of digital twins that we probably need to deal with. Observable elements are machine tools, rotating cutting tools; the manufactured part in itself is also an observable element. But we're starting to stretch the definition of an observable element, because we believe that a manufacturing process would also be an observable element. There is a nominal process, the thing that you actually lay out: this is the way I'm going to process a part. Now, with your manufacturing process, you can actually record how it actually got done. One machine is not available because it had a breakdown, so we had to process the part on another machine. So, there's an actual process and there's a nominal process. So again, you have the observable element, which is the process, and there's an actual and there's a nominal. 

Jan de Nijs: 

So, the bottom picture there, I think, always works for me. It basically says that there's a physical part to all this, and there's an interchange where we have digital artifacts. And on the right-hand side there's data flowing to all kinds of analysis software, all kinds of things that are helping me run my business better. But they also flow back into these observable elements to actually make them run better, or make them run more. Or in the case of machine tools, we want to have as much uptime as we can get. So, they basically help me use these observable elements in a much better and more efficient way. So, what's important is a scope limit. To me, to say that there is The digital twin is really something strange, because what do you mean? The universe? What are we talking about? So, you need to really start to narrow your focus. Your aperture has to come down. And so for this specific talk, I'm going to focus on manufacturing. I'm trying to limit myself to just looking at observable elements within manufacturing. But then, when you're looking at manufacturing, there is something more that you need to do. 

Jan de Nijs: 

To say, like, "Okay, what happens for a rotating tool?" Is there The digital twin of a rotating tool? A lot of people say, "Yes, there is," which is the nominal CAD model, and I'm saying, "Well, there really is not." It really depends on what your purpose is. If the scope of your digital twin is going to be extremely large, then there are a lot of digital artifacts that you need to describe that scope. But if you have a very limited, very quick use case, where you need, for example, quick turnarounds to make some money, you may need very few digital artifacts, and you can quickly get to where you need to be. I came up with a couple of examples. Rotating tools: we always want to know where they are, so you do real-time location tracking. But then there's also something called tool life management. We want to know when the tool is actually worn and needs to be replaced. For real-time location tracking, I only need a couple of data points every couple of minutes. It's really not a large stream of data that I need for that. 

Jan de Nijs: 

But when I have tool life management, it becomes a completely different ballgame. I need a completely different set of digital artifacts that help me describe the life of a tool. Those same data artifacts don't really help me with location tracking. So, if my purpose is only location tracking for my rotating tool, the number of data artifacts is very limited. If I want to do tool life management, it's still not very big, but it's a completely different set of data artifacts. So, you've got to really keep in mind what the purpose is. Why are you building this digital twin? So, I think it's impossible to say that there is The digital twin for a rotating tool. There is A digital twin, specifically formulated for a purpose. What's your use case? You hear that a lot these days. Everybody is talking about it. What is the use case for your data? What are you trying to accomplish with it? Which is basically going back to this whole idea of purpose. So, a digital twin needs to be defined for a purpose. If you don't have a purpose, you find that you're basically starting to gather data and have no idea what you're going to do with it. 
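The purpose-driven distinction above can be sketched in a few lines of Python. The artifact names, tool ID, and synchronization rates below are hypothetical illustrations, not actual Lockheed Martin definitions; the point is that one observable element can carry several twins, each scoped to one purpose:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TwinDefinition:
    """A digital twin defined for one observable element and one purpose."""
    observable_element: str   # the physical thing being described
    purpose: str              # the use case this twin serves
    data_artifacts: tuple     # artifact types this purpose requires
    sync_interval_s: float    # rate of synchronization, in seconds

# Same rotating tool, two purposes, two different twins:
location_twin = TwinDefinition(
    observable_element="rotating tool T-1042",
    purpose="real-time location tracking",
    data_artifacts=("rtls_position_fix",),
    sync_interval_s=60.0,     # a position fix every minute is plenty
)
tool_life_twin = TwinDefinition(
    observable_element="rotating tool T-1042",
    purpose="tool life management",
    data_artifacts=("spindle_load", "cut_time", "wear_offset"),
    sync_interval_s=10.0,
)

# The artifact sets are disjoint: neither purpose's data helps the other.
shared = set(location_twin.data_artifacts) & set(tool_life_twin.data_artifacts)
print(shared)   # set()
```

The design choice is that the twin is keyed on (element, purpose), never on the element alone, which is exactly the "there is no The digital twin" argument.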

Jan de Nijs: 

There are some people arguing that that's a good thing. The first ones that'll say that's a good thing are, for example, Google. They've been gathering wild amounts of data on us without really understanding what to do with it. But what Google has been very successful at, of course, is taking all kinds of data and developing a digital twin of us, so that they can sell us more advertising. Amazon tries to sell us more goods. So all these companies have, for years, been developing these large digital twins, but they have a purpose for them. Google is not trying to create a digital twin to make our lives healthier. That's not why they have a digital twin. They're focused. They know why they need a digital twin. They need a digital twin of you so that you'll look at more advertising, because then they can sell more advertising. So, their purpose is selling advertising. The purpose of Amazon's digital twin of people is selling more product. They have different purposes, and therefore they collect different artifacts, even though it's a digital twin of you. So, it is important that digital twins have a purpose, because otherwise you may actually lose focus and not know what you're doing. 

Jan de Nijs: 

Another very important issue is rate of synchronization. A lot of people are saying, "Well, you know, we need to get as much information as we can get," which is interesting. For example, to go back to my rotating tool: when the use case is real-time location tracking, do I really need to know where that tool is every millisecond? So, do I really have to have a data point per millisecond in that specific use case, or is it one data point per minute, or one data point per hour, or one data point per day? What is the rate of synchronization? How much data do I need to make real-time location tracking accurate enough for my specific use case? This is a very important issue. But there are a lot of people who say, "We want to do, for example, predictive maintenance on a machine tool." That's one of the really big ones going around right now. So, we want to create a digital twin to do predictive maintenance. So, they start to gather all this data, and they say, "Well, we can't get it fast enough." The reality is, for predictive maintenance, you don't need millisecond response time either. A 10 or 15 second interval is typically more than enough to actually do predictive maintenance. 

Jan de Nijs: 

The things that are going to fail within milliseconds are the things that predictive maintenance can never pick up. Those are things you have to fix at your machine tool controller. But predictive maintenance is looking at trends that play out over hours, days, months, and it tries to predict the best time to do service on your machine. So again, how much data do you really need for your specific use case? Then you've got to tailor the amount of data that you're getting to that. 
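The rate-of-synchronization point can be made concrete with a small sketch: thin a high-rate feed down to whatever interval the use case actually needs. This is a minimal illustration, not any particular product's API; the sample stream is made up:

```python
def downsample(samples, interval_s):
    """Keep one sample per interval from a time-ordered (timestamp_s, value) stream.

    Predictive maintenance watches trends over hours and days, so a
    millisecond feed can be thinned to the rate the use case needs.
    """
    kept, next_due = [], float("-inf")
    for t, v in samples:
        if t >= next_due:
            kept.append((t, v))
            next_due = t + interval_s
    return kept

# 10 minutes of 1 ms spindle-load readings -> one point every 15 seconds
raw = [(i / 1000.0, 0.0) for i in range(600_000)]
print(len(raw), len(downsample(raw, 15.0)))   # 600000 40
```

The same stream could feed an edge analytic at full rate while only the thinned series goes off to the data lake, which is the split described above.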

Jan de Nijs: 

Otherwise you're going to be awash in data and not really know what to do with it. You can say, "Well, what's the harm there?" Well, the problem with that is that your data analysts have problems with this. They can be overwhelmed with data that they basically don't know what to do with. And so you want to be realistic about the amount of data that you're going to feed them. If your digital twin is something that does need a really large amount of data, then you'll have to figure out how to supply that to your data analysts. There are definitely use cases where you want a lot of data flowing very, very quickly, such as AR/VR. Virtual reality goggles need a lot of data going really fast back and forth, and I get that. There are definitely use cases where the rate of synchronization is completely different than for others, and that's why you see edge compute coming on. You probably want to keep that very, very close to the devices. You don't want to send this to data lakes; you want to keep it close to the device. 

Jan de Nijs: 

Edge analytics, edge tools, they all depend on these rates of synchronization. If you need real, real-time response to large amounts of data, you can't go to your data lakes first. You've got to stay close. So, these are all things that you have to consider when you're talking about rate of synchronization. You don't want to just drown yourself in a lot of fairly useless data. You want to make sure that you have the right amount of data. And these are all things you ought to think about up front. You shouldn't say, "We're going to make a digital twin of manufacturing." You hear that a lot. That was one of the first things they asked me. They said, "Jan, you're going to be responsible for The production digital twin." And we quickly came to the realization that you can't really say that. You need to have these definitions in place. You really need to know specifically what you're going to do, and go much further down into detail. There are many, many, many different digital twins that make up a production digital twin. 

Jan de Nijs: 

So, to give you another example: the people that know me know I'm in part manufacturing. I do manufactured parts, machining. That's my background. I've bought a lot of machines, developed a lot of processes, processed a lot of parts. So, there is a nice example in the part world. We have these parts that we make. We want to describe them digitally, and there are, of course, all kinds of artifacts that do that, specifically for fracture-critical parts. These are the parts that we use in aerospace where, if they fail, it's catastrophic. So, they're called fracture-critical parts for a reason, and they have a whole bunch of requirements on them for all these digital artifacts. We need to store, almost forever, the original data, the as-designed requirements, so the as-designed engineering requirements. There are a lot of other artifacts related to that: the finite element analysis, simulations, FMEAs, the failure mode analyses, all kinds of things that go into that. 

Jan de Nijs: 

Our customer, certainly, has lots of requirements for saving that data, so that eventually, in the future, they can do postmortems. Every now and then one of these parts has a mishap, and we want to go really deep into the data artifacts to understand why it took that shape. And you need to be able to find those things again for a long time to come. In aerospace, I think everybody knows the C-130, right? The Hercules transport plane. The first one flew in 1957, so we're now 60 years into that program, and we're still building them at a very high rate. And there is no end in sight. We may be making these for another 20, 30 years. So these programs, these manufacturing programs, they're not 5 or 10 year programs like your iPhone. In aerospace, we're dealing with programs that are 50, 60, 70 years old. I think the C-130 will be there long after I retire, so it's a great money maker, of course, for us as a corporation, because those are wonderful projects to have. And we're very lucky to have a customer that's so interested in these projects, but it's a challenge. 

Jan de Nijs: 

For some of these projects, how do you keep all that data for so long, so that you can actually find out what happened when you have a mishap? But that is the original data. Then, of course, you also have serialized data, which is the instance of your engineering requirements. The instances of the model are what you actually have in front of you; in the picture on the right there, you have two of them. But for those specific critical parts, there are many, many artifacts that we have to save. There are material certs. We do inspection on the parts, and all these inspections result in QIF files, for example, QIF being the standard for inspection that's now being developed as an ISO standard. We also need to store how we actually processed this, the as-processed digital twin, as some people call it. The as-processed data is basically the MTConnect data. Again, MTConnect is the standard developed by AMT here. So, MTConnect becomes extremely important to save in the case of these types of elements. And then there's, of course, all the non-destructive inspection that actually goes on with these products. 

Jan de Nijs: 

All that stuff, all these other artifacts, they need to be available, so that when the customer has a mishap and wants to do a deep dive into what happened, they can actually go in there and have all that data, and you can unleash some analytics that automatically looks at that data and helps you pinpoint what happened with this specific part. We just had 777s with Pratt & Whitney engines that were shedding some blades. We want to go in there and find out, for that exact blade, how it was manufactured, what temperatures it saw during manufacturing, and find out what actually happened to that blade, why it failed. So, these are things that our customers are very, very interested in for these specific parts. So, the point I want to make is that there are these specific manufactured parts that are going to have a lot of artifacts. But of course, there are also parts that don't need so much. I've got a lot of bolts. I screw a lot of things together with just regular bolts that come out of a bin. Do I need all that data back to the source for all the bolts that are holding up a bracket? A bracket that just holds a tire up? I don't think so. 

Jan de Nijs: 

We really don't need that. We may want to have the original engineering requirements there, the as-designed data, because that basically describes how it fits into the assembly. That's what shows up, and that, for most applications with regular bolts, is all we need. So, for the data, you ask the manufacturer to send you a 3D file, they send you a STEP file, and you use that in your design, and that's usually more than enough. So, can I say that for every part that goes into my airplane I can define the same digital twin requirements as for every other part? And the answer is no. I just can't. I have all these different requirements on these parts, and depending on the requirements on the parts, the digital twin has to be defined. And so we, as a corporation, are struggling with that. We don't have the answer yet, but we're starting to realize that we actually have to develop requirements that we're going to flow down to our supply chain, to make sure that we actually get these artifacts together with the parts. 
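The fracture-critical versus bin-bolt contrast above amounts to artifact requirements keyed by part class. The sketch below is hypothetical (the artifact names and classes are illustrations, not actual flow-down requirements), but it shows the shape of such a requirements table:

```python
# Hypothetical artifact requirements, keyed by part criticality. A
# fracture-critical part drags a long tail of artifacts; a commodity
# bolt needs little more than the as-designed model.
REQUIRED_ARTIFACTS = {
    "fracture_critical": [
        "as_designed_model",   # original engineering requirements
        "fea_results",         # finite element analysis
        "fmea",                # failure mode and effects analysis
        "material_certs",
        "qif_inspection",      # QIF results per serial number
        "mtconnect_log",       # as-processed machine data
        "ndi_results",         # non-destructive inspection
    ],
    "commodity": [
        "as_designed_model",   # a STEP file is usually all you need
    ],
}

def missing_artifacts(part_class, delivered):
    """Which required artifacts are absent from a delivery?"""
    return [a for a in REQUIRED_ARTIFACTS[part_class] if a not in delivered]

print(missing_artifacts("commodity", {"as_designed_model"}))   # []
print(missing_artifacts("fracture_critical", {"as_designed_model"}))
```

A real flow-down would put a table like this in the purchase requirements, so a supplier knows up front which digital artifacts travel with which part class.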

Jan de Nijs: 

Another use case here that I want to bring up is one where we do process. In our case, for fracture-critical parts, many times we need to prove that we actually executed per a certain process. We want to be able to visualize the process even 50 years on. In the future, in the postmortems and all these deep dives that we have to do after mishaps, we want to be able to prove that we, the vendor, created the part to the nominal process. So, yeah, we need to keep that data for a very, very long time. So, we want to have the original information, of course, which is the as-planned process. This is the way I laid it out. The standards that are going to be very, very important there are the STEP AP238 standard, which helps you lay out the process, and the QIF inspection plan, which basically lays out the inspection portion of that. And then of course there is the nominal data for your machinery, your fixturing, all that stuff. It needs to be available, so that you can actually use it to create the AR/VR experiences in the future. 

Jan de Nijs: 

But then, of course, specifically for that serial number, there are digital artifacts. There's the MTConnect file from when the part actually was produced, which we call the as-processed digital twin. And there's the proof we're looking at. Okay, how do you prove that we actually performed a manual operation? Let's say there is a manual operation developed for this specific part. Believe it or not, we still do that quite a bit, because our volume isn't super high. So, many times it doesn't really pay back to automate everything, so there's a manual operation. For example, manual inspection: somebody goes out with a caliper and measures something. Well, how do you prove that it actually got measured, that the operator didn't invent a number? So, we're starting to use more and more of these local video clips, where you have your caliper and you basically video it. These types of artifacts are all going to be stored with the process digital twin, to later prove that you actually followed the process while making this part. So, these are data artifacts that are specifically for certain visualization or future analysis requirements. 

Jan de Nijs: 

So, coming to some conclusions here. It is important for digital twins to be specific, to have your use case, to actually have a purpose for all this. The amount of data and the rate of synchronization need to match your digital twin needs. There are other artifacts related to the digital twins. An important point, which I may have actually missed earlier, is that when we define digital twins for specific use cases, we're also going to flow those down to our supply chain, whether internal or external. In the future, it shouldn't matter whether the part came from inside or outside. So, if I buy a part from an outside source, or I have a part that's produced by my in-house capabilities, when the part is delivered, of course it needs to be the same. I shouldn't be able to tell the difference between the two, and my digital artifacts should also be the same. So, the future of ordering parts from suppliers will be that there will be a requirement to actually deliver the part, and there will be a requirement to deliver it with digital artifacts. So, there are going to be two portions of a manufactured part delivery. One is the actual part, the steel, the aluminum, whatever it is, and the other part is going to be the standardized digital artifacts that describe it. 
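The two-portion delivery described above can be sketched as a simple completeness check. The required artifact names are assumptions for illustration; the idea is only that acceptance gates on both the physical part and its digital artifacts:

```python
# Sketch of the two-portion delivery: the physical part plus its
# standardized digital artifacts. Artifact names are hypothetical.
REQUIRED = {"qif_inspection", "mtconnect_log"}

def delivery_complete(part_received, artifacts):
    """A delivery is finished only when the part AND its artifacts arrive."""
    return part_received and REQUIRED.issubset(artifacts)

print(delivery_complete(True,  {"qif_inspection", "mtconnect_log"}))  # True
print(delivery_complete(True,  {"qif_inspection"}))                   # False
print(delivery_complete(False, {"qif_inspection", "mtconnect_log"}))  # False
```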

Jan de Nijs: 

For example, a QIF file, an MTConnect file, all these things are going to be required deliverables. In the future, the delivery is not finished until I have both all my parts and all my digital artifacts. Yes, we as tier-one suppliers to our customers don't have many of these standards developed internally yet. We don't know how to do that yet, because that is not something that we've done in the past, and so we're developing these things. This is still ongoing. We're not there yet. Anyway, that's the third point there: we need to have the same amount of data in the same format regardless of whether the part is made in-house or externally. This is, of course, customer driven for us, and the customer very much wants us to adopt standards: STEP, MTConnect, QIF. In the electronics world, there's a whole series of standards called IPC. In the facilities world, to go back to your plant, there's a standard called Haystack. These will all become very, very important, because they offer a way toward long-term archival and retrieval, LOTAR, which is also an effort currently going on within the STEP world. 

Jan de Nijs: 

We need to have these data artifacts available to us for many, many decades to come, and LOTAR is an effort by many of the aerospace companies to actually start to address all that. So, with that, I think I'm at the end of my presentation. So, Steven, or whoever is in the background, I hope somebody followed the questions. I hope there are questions, because otherwise we're going to move quickly to the next speaker. 

Stephen LaMarca: 

No worries, Jan. We've got plenty of good questions, and as- 

Jan de Nijs: 

Cool. I'll- 

Stephen LaMarca: 

... we did on Monday, I'm actually going to kick us off with my more remedial questions. But I'd like to start with my key lesson learned from you, at least from the beginning of your presentation. The key lesson I took away is that there's no The digital twin. You have A digital twin, and that's what you're going for. There's no one twin that covers everything that everybody has to do with anything digital twin. You want to specialize your digital twins, and you need limits to a digital twin. And I think what you're getting at with that is the same kind of risk that you get when you're implementing big data, or any sort of communication standard like MTConnect: you don't want to collect too much data. You don't want too much information, or that's going to back things up. But where I was going with this is: is there a minimum requirement for something to be considered a digital twin? What I'm getting at is that today, CAD models can get really detailed, really detail oriented with the part that you're designing, to the point where you can do all of your computational fluid dynamics, all of your simulation and digital stress testing on the part that you've designed, and your computer-aided manufacturing, so that you can potentially manufacture it. 

Stephen LaMarca: 

You do all your simulations digitally before you actually produce the part, but in some cases that's not considered a digital twin. What is the minimum requirement for something that's just a really well-developed digital model to become a digital twin? 

Jan de Nijs: 

Yeah. We call that the original engineering that goes into these parts. Right now, actually, we're starting to call that the virtual prototype. 

Stephen LaMarca: 

Okay. 

Jan de Nijs: 

Yes, there is, again, a debate raging in the industry about what to call all the things you're alluding to that happen in engineering. There's very important work happening in engineering, and the engineering folks like to call it the digital twin. And this is a big point of contention within the industry. That's why I started off with the definitions. For the purpose of this talk, we want twins to come in twos. There need to be two. In this picture here on the screen, you have the human being and you have the digital description of the human being. There's a real observable element, and you have the data that describes it. Now, what engineering is doing when they develop the prototype is incredibly important. Our industry is led by "fly the twin." They call it a digital twin. Like I said, I don't agree with that, because when you're actually developing this, there's nothing out there yet. And what is your number two? Where is your number two? There is no number two. 

Jan de Nijs: 

Well, they say, "Well, in the future there will be a lot of number twos, threes, and fours." Okay, I get that, but there is no real one-to-one relationship. You can't say that the thing that we fly early on in engineering already has something that exists in reality. Now they're going to say, "Well, there are prototypes, and there are little test stands, and they're feeding data in." I get that. But that is not truly a description of another thing. So, for what's happening in engineering, we now like to say that's the world of digital prototyping. That's where you fly this digital prototype, where we put a lot of hours on this digital prototype, and it's all incredibly important work. But for the purposes of digital twinning, in theory, we do not call those things digital twins anymore. We call them the digital prototype. And don't get me wrong, there's a lot to be learned from the digital prototype, and that then basically turns into nominal data that we actually manufacture something from. So now you get these STEP models that we use in manufacturing to actually start to make all these copies, if you will. 

Jan de Nijs: 

We start to make copies, and there's a real part and there's a nominal part, and you compare the two. So, in our world, we believe that there has to be something physical, and then there are these digital artifacts that describe that physical thing. And yes, I'm wading into a very deep discussion here, because the engineering world tends to see this as an attack on their world. They have this thing on the screen and it flies, and the government thinks it's all very important. Our customer really wants us to use a lot of that. And if we don't call it the digital twin, they seem to feel slighted. They have an emotional response to that. But it seems that other people are starting to realize that, no, what's happening in the engineering world is not truly a twin. It doesn't really describe an observable element. It describes the wish for an observable element in the future. 

Stephen LaMarca: 

Got you. 

Jan de Nijs: 

So, by calling what the engineering world is doing a digital prototype, we're starting to distance ourselves from that. Digital twins need a cyber part and a physical part, and the digital prototype is what actually is being simulated, and flown, and whatever. 

Stephen LaMarca: 

Sure. All right. So, starting with some of our questions that came in, our first one was do you need all this data in one system, or could it be in a couple if you don't link them together? 

Jan de Nijs: 

Okay. That's another talk. That is one of my absolute favorites. Thank you for this question. Yes, absolutely, that is a very, very, very good point. Lots of this data, and this is very important in our business, lots of this data has governance on it. And governance is a really big field. Okay, so you've got data. Who can actually look at it? I have classified data. I have top secret, secret, all kinds of levels of classification that go with this data. And it all depends on what bucket you throw this into. So, a lot of people are saying, "Oh well, manufacturing data, right? We've got to dump it into our MES system." That has proven to be a major mistake. MES, manufacturing execution systems, are basically there to orchestrate my manufacturing. Basically what they do is take your process and orchestrate how to pull it through your physical factory floor, through all the machines that you have. But then, okay, does all this data, like QIF data, MTConnect data, belong in your MES system? 

Jan de Nijs: 

And I'll tell you that the answer is no, absolutely not. MTConnect data needs its own governance. So, what you want to have is your own data source for, for example, your MTConnect data. You want to have your own data source for your QIF data. You want to have your own data source for your IPC data. Now, these data sources these days are very easy to set up. There are a couple of really big companies that are helping us with storing data. So, setting up these independent data stores, these big areas where you can store massive amounts of data, is not difficult anymore. So, now we are actually starting to define these data stores based on governance. MTConnect data has different governance than QIF. QIF is inspection, and inspection you may want to keep for a long time. MTConnect data, maybe not so long; it all depends. But by creating different stores for those, you put them in there. Now, what happens is, if you just dump data in there that has absolutely no traceability, that you cannot trace to your serial number, and even worse, that you cannot trace all the way down to the feature you were actually making, then that data becomes unusable. 
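The store-per-governance idea above can be sketched as a small routing table. The retention periods and classification labels below are made up for illustration, not actual governance rules; the point is that the store is chosen by artifact type, each with its own governance, rather than everything landing in MES or PLM:

```python
# Hypothetical governance table: each artifact type gets its own store
# because it carries its own retention and classification rules.
STORE_GOVERNANCE = {
    "mtconnect": {"retention_years": 5,  "classification": "internal"},
    "qif":       {"retention_years": 70, "classification": "controlled"},
    "ipc":       {"retention_years": 10, "classification": "internal"},
}

def route(artifact_type):
    """Pick the data store, and its governance, for an artifact type."""
    if artifact_type not in STORE_GOVERNANCE:
        raise ValueError(f"no governed store for {artifact_type!r}")
    return f"{artifact_type}-store", STORE_GOVERNANCE[artifact_type]

store, rules = route("qif")
print(store, rules["retention_years"])   # qif-store 70
```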

Jan de Nijs: 

So, okay. It may be sitting in its own data store, but if the data is not properly marked, doesn't have all the proper links back to the model, to the model requirements, it still becomes useless data. So, you need to make sure that your data is not just a datapoint, like tag number ABC is 15. Okay, well, what does that mean? So, MTConnect provides a level of meaning to these things. And then QIF can provide you with an inspection of all those numbers, how well you actually did. But those are all different things. They belong in different data stores. They have different governance on them, but in the end, they all need to be reflected back to the engineering requirements. Yeah, that is a really important field: how do you link all these disparate data elements that are flying out there back to model requirements? Again, I'd love to get into that, but that's a really different topic because it requires another hour of speech here, and I can do that. But it's probably not going to be helpful at this point, right?
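The "tag ABC is 15" problem can be made concrete with a minimal record shape. In this hypothetical Python sketch, the field names (serial number, feature, requirement ID) are assumptions standing in for whatever a real model-based-definition link would carry; the idea is only that a measurement without those links is unusable.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Measurement:
    tag: str                       # e.g. an MTConnect-style data item name
    value: float
    serial_number: Optional[str]   # which physical part this came from
    feature_id: Optional[str]      # which feature on the part was being made
    requirement_id: Optional[str]  # link back to the engineering requirement

def is_traceable(m: Measurement) -> bool:
    """A bare datapoint ('tag ABC is 15') is useless without all three links."""
    return all([m.serial_number, m.feature_id, m.requirement_id])

orphan = Measurement("ABC", 15.0, None, None, None)
linked = Measurement("ABC", 15.0, "SN-1042", "HOLE-7", "REQ-D-113")
print(is_traceable(orphan), is_traceable(linked))  # False True
```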

Stephen LaMarca: 

Right. Well, I think a follow-on question to that first one is can part of your data be in your PLM and also your MES system, or do they need to be in one big system? 

Jan de Nijs: 

Okay. This is Jan's opinion, and I must admit within Lockheed this is a raging battle. It cannot be in PLM. It cannot be in MES, okay?

Stephen LaMarca: 

Okay. 

Jan de Nijs: 

Jan's opinion is that that's a very bad idea. But I must admit there are some IT people that are battling me tooth and nail, and I may still lose this battle, but I do not believe, I passionately do not believe, that this type of data belongs in any of these systems. And purely for the reason that the governance on these systems is different. The PLM system has different governance than a store for MTConnect data. So, what you're doing is, you've got PLM, you've got a basket of apples, and you're throwing in oranges, and you're saying, "The life of the orange is the same as the life of the apples." And we all know that that's not the case. So you can't do that. So, I'm passionately against that, but I must admit that that's not yet the Lockheed Martin way. We are in the midst of really debating that, and trying to figure out where to store all this data. And like I said, I think I'm starting to get more and more converts coming with me who are saying, "No, this really needs to be separate data lakes." So, you store like data in a separate data lake, so that you can have better governance over it.

Stephen LaMarca: 

Got you. Rebecca, I feel like you're itching to ask something. 

Rebecca Kurfess: 

I am. I actually had a question on the opposite end of the spectrum of your question about the minimum requirement for a digital twin. I know you highly recommend limiting the scope of your digital twins, and narrowing that focus, but you also said that some of your projects are 20, 30 years long. Do you see any value in collecting data that you don't quite have a use for yet, but that could potentially be useful in analysis developed in the future? 

Jan de Nijs: 

Yeah, that's the case. That's absolutely true. That's absolutely true. Yeah, we cannot envision every use case today, but that should not take away from the fact that this is a balance. What is it that you really want to accomplish? If you collect so much data that you can't figure out what you're doing with it, it becomes just a cost for no reason. Yeah, I see your point. Some people are arguing, just gather all the data you can, and just dump it in a data lake. And there may be something good to that, and there are use cases that are going to be developed in the future. But what I've found so far, and we've all been doing this only for 5 to 10 years, right? What I've been finding is that when you really get to it, you find that you have actually been looking at the wrong data. You come up with this new analysis that you didn't think of 10 years ago, and now you go look at all the data that you have there. And you find out, you know what? The right data isn't there.

Jan de Nijs: 

Even though I thought it was. Even though I was thinking that I had a lot of data coming down, it's not enough; certain tags are missing. Because right now, for example, our industry is very much driven by prescriptive and predictive maintenance. A lot of the reason we're monitoring is maintenance. But now that digital twinning is coming, we look at the maintenance data, and we can't do active digital twinning; we don't have the data for the basic requirements of digital twinning. So, we're going to go back and try to get more data added to these data streams. So, would the alternative have been possible? A machine tool has about a million tags. Should I try to capture those million tags because I might have an application for them in the future? My contention is no. Let's limit ourselves to why we are doing this, and get all the tags that we need.

Stephen LaMarca: 

Got you. 

Jan de Nijs: 

If that new analysis tool comes out, okay, then let's look at it again, and let's start to add those data tags. If I have to choose out of the million tags, if I have to start to pick the top 500, then yes. But I can't do all million of them. It's just not going to happen, and we don't have enough storage. It's very expensive. So, I think the amount of data you collect should be use case driven. That remains my opinion.
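Use-case-driven tag selection out of the million available tags amounts to maintaining an allowlist per use case and collecting only the union of what the active use cases need. The tag and use-case names in this Python sketch are invented for illustration.

```python
# Hypothetical mapping from use case to the handful of tags it actually
# needs, out of the ~1,000,000 a machine tool could expose.
USE_CASE_TAGS = {
    "predictive_maintenance": {"spindle_load", "spindle_temp", "axis_vibration"},
    "digital_twin_position":  {"Xabs", "Yabs", "Zabs", "program_block"},
}

def select_tags(active_use_cases: list[str]) -> set[str]:
    """Collect only the union of tags required by the active use cases."""
    wanted: set[str] = set()
    for use_case in active_use_cases:
        wanted |= USE_CASE_TAGS[use_case]
    return wanted

# Adding a new analysis later just means adding its entry and re-running.
print(sorted(select_tags(["predictive_maintenance"])))
```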

Stephen LaMarca: 

Got you. Moral of the story: know what you're looking for before you start collecting data. We've got two more questions before we have to hop to the next speaker, and actually the next question comes from our next speaker. John Berg asks: on your comment about the frequency of data collection, it would seem this can be guided by a change of state. So for example, if a tool is on a machine tool, I don't need to check its location unless it's taken off the assigned machine. Am I thinking correctly?

Jan de Nijs: 

Yes. That's absolutely true. That is another way to look at it: you may be tracking it, and then only record when a change of state happens. This is an active discussion. For example, in a standards committee like MTConnect, we're starting to get into those issues now. How do you track these observable elements on your production floor? Change of state may be one of the ways we're going to do that.
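The change-of-state idea in John Berg's question is essentially report-on-change filtering: only emit an observation when the value differs from the last one seen. A minimal Python sketch (the tag names are made up):

```python
def on_change(stream):
    """Yield (tag, value) only when the value differs from the last one seen for that tag."""
    last = {}
    for tag, value in stream:
        if last.get(tag) != value:   # state changed (or first observation)
            last[tag] = value
            yield tag, value

# A tool sitting on machine M1 generates no traffic until it moves.
readings = [("tool_location", "M1"), ("tool_location", "M1"),
            ("tool_location", "M1"), ("tool_location", "crib")]
print(list(on_change(readings)))
# [('tool_location', 'M1'), ('tool_location', 'crib')]
```

Polling three times while the tool never moved produced nothing new; only the first reading and the move to the crib survive.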

Stephen LaMarca: 

And the last one: what are your thoughts on polling frequency for IIoT devices in production environments?

Jan de Nijs: 

Yeah, again, that's the same issue. What's your use case? Give me your use case, and I'll tell you what the polling needs to be. A lot of people are saying, when you do prescriptive maintenance, "Oh, I need to have it every millisecond," and I'm saying no, you don't. But if you have AR/VR, then a millisecond is too long; we need to be even better than that. So, to me, it's: what is your use case, and what are you trying to accomplish? A lot of people say that IoT should have safety implications: I should capture safety events, I need to stop the machine before it disintegrates, or before it kills a person. And I say that is an absolutely flawed philosophy. If you rely on data going from a machine tool first to a data lake, then to analysis software, and then coming back to your machine to stop it and prevent damage or physical harm, you're on the wrong track. Safety needs to be done on the machine tool. It needs to be in the PLC, in the motion controller, and you have to have millisecond turnaround.

Jan de Nijs: 

And so it all depends on what you want to accomplish again. What is the goal of your digital twin? That basically determines how much data, how quickly you need to poll. 
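Jan's "use case determines the polling rate, and safety is never a polling use case" position could be captured in a small lookup. The intervals below are illustrative assumptions for the sketch, not recommendations from the talk.

```python
# Illustrative polling intervals per use case (seconds). The numbers are
# assumptions for this sketch, not engineering guidance.
POLL_INTERVAL_S = {
    "predictive_maintenance": 1.0,     # far coarser than "every millisecond"
    "ar_vr_visualization":    0.0005,  # better than a millisecond to feel live
}

def polling_interval(use_case: str) -> float:
    """Return a poll interval for a use case; refuse safety outright."""
    if use_case == "machine_safety":
        # Safety belongs in the PLC / motion controller loop,
        # never in a poll -> data lake -> analyze -> react round trip.
        raise ValueError("safety must run on the controller, not via polling")
    return POLL_INTERVAL_S[use_case]

print(polling_interval("predictive_maintenance"))  # 1.0
```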

Stephen LaMarca: 

All right. Jan, thank you so much. Our discussion has been eye-opening as always, and a great pleasure having you on here. 

Author
Jan de Nijs
LM Fellow