Add new additional data field
Closed
- Update water_current_velocity https://git.noc.ac.uk/communications-backbone-system/backbone-message-format/-/blob/67-mas-dt/formats/platform_status.py#L103
- Add new field additional_data
changed milestone to %CB-2024W41
Ben's email from last week requests more than just water_current_velocity:
From: Benjamin Allsup
Sent: 09 October 2024 15:10
To: Owain Jones owain.jones@noc.ac.uk
Cc: Ashley Morris ashley.morris@noc.ac.uk
Subject: MAS-DT Slocum->ORI
Hello Owain,
After yesterday's meetings I think we need to do the following:
From the glider surface dialog the following should be sent to ORI as soon as possible on each surfacing. I would include the timestamp as well. Are you doing that already, converting "secs ago" to the timestamp of the data?
GPS Location: 3837.595 N -7056.211 E measured 3.408 secs ago
sensor:c_wpt_lat(lat)=3838.3 3991.34 secs ago
sensor:c_wpt_lon(lon)=-7052.8 3991.36 secs ago
sensor:m_coulomb_amphr_total(amp-hrs)=0 1e+308 secs ago
sensor:m_lithium_battery_relative_charge(%)=0 1e+308 secs ago
sensor:m_surf_water_vel_mag(m/s)=0.274028470326962 158.805 secs ago
sensor:m_surf_water_vx(m/s)=-0.0493392586711931 158.821 secs ago
sensor:m_surf_water_vy(m/s)=-0.269550069752378 158.834 secs ago
sensor:m_water_vx(m/s)=0.0469179260102644 158.872 secs ago
sensor:m_water_vy(m/s)=-0.0540481688056459
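Ben's question about converting "secs ago" into a timestamp could be sketched roughly as below. This is not the sfmc-adapter code: the function name `parse_sensors`, the `received_at` parameter and the `NEVER_UPDATED` threshold are hypothetical, and treating an age like 1e+308 as "never updated" is an assumption based on the dialog above.

```python
import re
from datetime import datetime, timedelta, timezone

# Matches "sensor:<name>(<unit>)=<value>" with an optional "<age> secs ago" suffix.
SENSOR_RE = re.compile(
    r"sensor:(?P<name>\w+)\((?P<unit>[^)]*)\)=(?P<value>\S+)"
    r"(?:\s+(?P<age>\S+) secs ago)?"
)

# Assumption: an age as large as 1e+308 means the sensor was never updated.
NEVER_UPDATED = 1e300


def parse_sensors(dialog, received_at):
    """Return {sensor name: reading} with an absolute measured_at per sensor."""
    readings = {}
    for match in SENSOR_RE.finditer(dialog):
        age = float(match["age"]) if match["age"] else None
        if age is None or age >= NEVER_UPDATED:
            measured_at = None  # no usable age, so no timestamp
        else:
            measured_at = received_at - timedelta(seconds=age)
        readings[match["name"]] = {
            "value": float(match["value"]),
            "unit": match["unit"],
            "measured_at": measured_at,
        }
    return readings


if __name__ == "__main__":
    sample = (
        "sensor:m_lithium_battery_relative_charge(%)=0 1e+308 secs ago\n"
        "sensor:m_surf_water_vel_mag(m/s)=0.274028470326962 158.805 secs ago\n"
    )
    for name, reading in parse_sensors(sample, datetime.now(timezone.utc)).items():
        print(name, reading)
```

Each sensor gets its own timestamp here, which is one way of handling the fact that the ages differ from field to field.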
- m_coulomb_amphr_total(amp-hrs): total amount of battery energy used; we don't have a field for that
- m_water_, m_surf_water_, m_final_water_: Have asked Ben what the difference is between these three. Masterdata doesn't really explain it.
- m_surf_water_vel_mag(m/s): Surface water velocity magnitude??
  - Glider reports velocities as an x/y vector - that water_current_velocity field can be converted to a speed (m/s) and an angle (degrees)?
- sensor:c_wpt_lat(lat)=3838.3 3991.34 secs ago
- sensor:c_wpt_lon(lon)=-7052.8 3991.36 secs ago
We do have the concept of the payload.platform_timestamp and header.timestamp to account for the difference between time measured and time sent, so I don't think we should need to change anything in the message format for that, although like Ben said we should maybe be removing the secs ago from the timestamp. Although since this is different for different fields in the status, that's an interesting question. I guess we could have an optional payload.position_timestamp to represent the case where the GPS time was significantly older than the other variable data, but this seems awkward.
- sensor:m_coulomb_amphr_total(amp-hrs)=0 1e+308 secs ago
- sensor:m_lithium_battery_relative_charge(%)=0 1e+308 secs ago
I don't think this works in the general case. I think what we need is the total battery capacity at 100%. The problem with remaining % + capacity used is that we don't always start a mission at 100%. From the initial capacity and the remaining % you can calculate everything else. If you started at 80% and then reported 79% with 1 amp hour used then that would suggest that 1 amp hour = 21% rather than 1 amp hour = 1%.
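The 80% / 79% example above can be worked through in a few lines. This is only an illustration of the arithmetic; the function names are made up for the sketch and nothing here reflects how the glider itself reports or calibrates its battery.

```python
def capacity_assuming_full_start(used_ah, remaining_pct):
    """Naive estimate of total capacity (Ah): assumes the mission began at 100%."""
    return used_ah / ((100.0 - remaining_pct) / 100.0)


def capacity_from_start_pct(used_ah, start_pct, remaining_pct):
    """Estimate of total capacity (Ah) when the starting percentage is known."""
    return used_ah / ((start_pct - remaining_pct) / 100.0)


def remaining_ah(total_capacity_ah, remaining_pct):
    """With the 100% capacity known, the percentage converts directly to Ah."""
    return total_capacity_ah * remaining_pct / 100.0


if __name__ == "__main__":
    # Started at 80%, now at 79%, with 1 Ah used.
    print(capacity_assuming_full_start(1.0, 79.0))   # wrong: treats 1 Ah as 21%
    print(capacity_from_start_pct(1.0, 80.0, 79.0))  # 1 Ah is 1% of the pack
    print(remaining_ah(100.0, 79.0))
```

Without the starting percentage (or the total capacity), amp-hours used plus remaining percentage cannot be turned into a capacity, which is the point being made above.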
- sensor:m_surf_water_vel_mag(m/s)=0.274028470326962 158.805 secs ago
- sensor:m_surf_water_vx(m/s)=-0.0493392586711931 158.821 secs ago
- sensor:m_surf_water_vy(m/s)=-0.269550069752378 158.834 secs ago
- sensor:m_water_vx(m/s)=0.0469179260102644 158.872 secs ago
- sensor:m_water_vy(m/s)=-0.0540481688056459
I don't think the present water_current_velocity is being used. If it is it would be by hydrosurv and I would expect Simon would a) have questioned it and b) be happy to change it. I think I can check this in the audit logs from SoAR so I'll have a look at those and update here.
Looks like the water_current_velocity was a float until this MR which explicitly requires that change to a string with the compass direction appended but doesn't say why.
The word velocity requires direction (without direction it's speed) so a pure float doesn't make sense with that name. It seems like the velocity decomposed into x and y components is common although conventionally referred to as U(x) and V(y) for some reason. I wonder if we should follow that convention so we have something like:
derived_water_velocity_surface_u
derived_water_velocity_surface_v
derived_water_velocity_average_u
derived_water_velocity_average_v
We could do the magnitude and bearing but U,V seems to be more common.
From this you could remove the surface drift from the average current to get the sub-surface average current or the discrepancy between the dead-reckoned and GPS surface location.
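For comparison, going from U,V to magnitude and bearing is only a couple of lines, so either representation can be derived from the other. A minimal sketch, assuming u is the eastward component, v is the northward component, and the bearing is the direction the water is flowing towards, in degrees clockwise from north (the thread does not confirm that the Slocum x/y axes follow this convention):

```python
import math


def uv_to_speed_bearing(u, v):
    """Convert eastward (u) and northward (v) components in m/s to
    (speed in m/s, bearing in degrees clockwise from north)."""
    speed = math.hypot(u, v)
    bearing = math.degrees(math.atan2(u, v)) % 360.0
    return speed, bearing


if __name__ == "__main__":
    # m_surf_water_vx / m_surf_water_vy from the surface dialog above.
    speed, bearing = uv_to_speed_bearing(-0.0493392586711931, -0.269550069752378)
    print(speed, bearing)
```

With the surface values from Ben's email, the speed comes out at the reported m_surf_water_vel_mag to within rounding, which suggests the magnitude is redundant if both components are sent.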
I think my problem with this is the platform_status says what is the state of the platform now. The water velocity data is the calculated difference between 2 positions so it feels to me like it's putting data representing 2 very different concepts into 1 message. I wonder if what we should send is the dead-reckoned position and the GPS position which would allow any stateful shore-side software to make these calculations as required.
We could use the schemaless additional_data type model to pass platform specific data but I think for integral things like batteries and water velocity it would be better to come up with a genericised model for the backbone to maintain the platform agnostic approach wherever possible. However, we could do both. It might make it easier to train a model if the original variable names are preserved so that the mission data matches the training data.
I guess this needs a wider conversation with BODC. If we want to make a genericised model then it would be good for BODC to publish the genericised data which would then mean that the modeling could be trained on the genericised data.
position_timestamp might be useful, yeah. It can be useful to know if the GPS fix time is getting old.
The timestamp reported in platform_status is the glider's current time - not the GPS fix time. That's how it's always been on C2. The time of position could be off by a matter of minutes to half an hour (although if it's drifting on surface, callback 5 to get a new GPS fix right). But the internal clock time will continue to increment, and I think that's important for platform_status - it's the glider telling us something at that point in time, even if the sensor data is getting stale-er. tl;dr: Don't need to change anything with this one
RE m_coulomb_amphr_total and m_lithium_battery_relative_charge - I don't know if the intent was to combine the info from these two, just report them both? The % charge gets us current battery capacity, the amp-hrs used is a mileage counter so can be useful to see the absolute energy used on the previous dives (e.g. how much energy does option 1 use v option 12a, on average).
The word velocity requires direction (without direction it's speed) so a pure float doesn't make sense with that name. It seems like the velocity decomposed into x and y components is common although conventionally referred to as U(x) and V(y) for some reason. I wonder if we should follow that convention so we have something like:
derived_water_velocity_surface_u
derived_water_velocity_surface_v
derived_water_velocity_average_u
derived_water_velocity_average_v
I agree with this: sticking with conventions, and it means we don't have to convert slocum x/y (u/v) into velocity/angle/magnitude etc.
We could use the schemaless additional_data type model to pass platform specific data but I think for integral things like batteries and water velocity it would be better to come up with a genericised model for the backbone to maintain the platform agnostic approach wherever possible.
Agreed, normalise the data
However, we could do both. It might make it easier to train a model if the original variable names are preserved so that the mission data matches the training data.
Also agreed
This is why greatblue2 events ended up storing the same data in multiple formats in the events schema. Normalised platform-agnostic version, and then "the raw data, but converted to JSON" dumped into the data field. The generic blob of data payload thing is not great but maybe there's a cool way of attaching a schema to it, to at least describe the structure, for the benefit of whoever is consuming the messages?
How about a required (but nullable) $schema field in the structure? That way people are made aware that if they provide additional data they have to pay attention to $schema and either intentionally make it null, or provide one.
I don't know if you would / if it's worth using that for validation in CB, but it could be useful for the clients... maybe.
{ "additional_data": { "$schema": null, "arbitrary": "value", "foo": 1 } }
{ "additional_data": { "$schema": "https://path/to/my/slocum/glider/data/structure/schema.json", "sensors": { "m_surf_water_vel_mag": [0.274028470326962, 158.805] }, "devices": {...} } }
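The "required but nullable" rule for $schema could be enforced with a check along these lines. It is a sketch only: `schema_of` is a hypothetical helper, not part of backbone-message-format, and it deliberately does not fetch or apply the referenced schema.

```python
def schema_of(additional_data):
    """Return the $schema value (a URL string or None) of an additional_data dict.

    Raises ValueError if the key is absent, so senders have to either set it
    to null on purpose or point it at a schema.
    """
    if "$schema" not in additional_data:
        raise ValueError("additional_data must include a '$schema' key (null is allowed)")
    schema = additional_data["$schema"]
    if schema is not None and not isinstance(schema, str):
        raise ValueError("'$schema' must be a string or null")
    return schema


if __name__ == "__main__":
    print(schema_of({"$schema": None, "arbitrary": "value", "foo": 1}))
    print(schema_of({
        "$schema": "https://path/to/my/slocum/glider/data/structure/schema.json",
        "sensors": {"m_surf_water_vel_mag": [0.274028470326962, 158.805]},
    }))
```

Whether CB should go further and validate the payload against the referenced schema is the open question raised above; this only checks that the sender made a deliberate choice.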
I guess this needs a wider conversation with BODC. If we want to make a genericised model then it would be good for BODC to publish the genericised data which would then mean that the modeling could be trained on the genericised data.
Yeah, sounds like this ties in with JSON-LD and use of vocabs.
How about: We do what we can with
additional_data
now, and then we have more wiggle-room and time to think about the genericised side in cahoots with Alexandra et al?
RE m_coulomb_amphr_total and m_lithium_battery_relative_charge - I don't know if the intent was to combine the info from these two, just report them both? The % charge gets us current battery capacity, the amp-hrs used is a mileage counter so can be useful to see the absolute energy used on the previous dives (e.g. how much energy does option 1 use v option 12a, on average).
I think my point was that you couldn't combine them if you wanted to. If you know what 100% represents in amphours then you can derive what the difference between 76.3% and 74.7% represents in amphours or do any other sums. But if you just have the number of amphours used and the remaining percentage then there are a number of calculations you can't do.
Either way you need 2 platform status messages (before and after) to calculate the delta in terms of percentage or absolute so you have to be maintaining state.
It wouldn't make sense to send the 100% capacity per platform_status because it wouldn't change over time so it's just wasted bandwidth. If we specified the total capacity in the planning configuration then sending both is redundant. Maybe having some redundancy in the data model is desirable as a sense-check.
changed milestone to %CB-2024W43
added Weight::3 label
Updates from this afternoon's dev meeting:
- @victord says the readings Ben suggests adding to the surface dialogs (m_coulomb_amphr_total and the current velocities) would be nice to have for the future but are not required for the dry-run phase -> we have time to think more about how best to represent current velocities (naming of fields, units etc)
- Justin linked us to the Ocean Gliders formats and SOPs for depth averaged currents to get a feel for existing standards/conventions: https://github.com/OceanGlidersCommunity/OG-format-user-manual?tab=readme-ov-file and https://oceangliderscommunity.github.io/DepthAverageCurrents_SOP/README.html
- Everyone in agreement that having a field like additional_data with the platform-specific (e.g. Slocum-centric) values is a reasonable compromise as long as we also attempt to standardize the data outside of that field. So for the meantime we can start with adding all the Slocum sensor readings to additional_data and work on "give them standard field names and standard units on the outer platform_status" secondary.
  - (That would be a really easy thing for me to implement on the sfmc-adapter side as I already wrote the code to parse all sensor readings available in a Slocum surface dialog so it'll only be a few lines of code to add those to a platform_status message in an additional_data dictionary)
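As a rough idea of what "a few lines of code" might look like, the sketch below copies already-parsed readings into an additional_data dictionary. It is not the sfmc-adapter implementation: `with_additional_data` is a hypothetical name, the `status` dict stands in for whatever the real platform_status payload contains, and the [value, secs ago] layout simply mirrors the JSON example floated earlier in the thread.

```python
def with_additional_data(status, sensors, schema_url=None):
    """Return a copy of `status` with Slocum readings under additional_data.

    `sensors` maps sensor name -> (value, seconds since the reading was taken).
    """
    message = dict(status)  # shallow copy, leaves the caller's dict untouched
    message["additional_data"] = {
        "$schema": schema_url,  # null unless a schema is published
        "sensors": {name: [value, age] for name, (value, age) in sensors.items()},
    }
    return message


if __name__ == "__main__":
    status = {"example_field": "placeholder"}  # hypothetical platform_status payload
    sensors = {"m_surf_water_vel_mag": (0.274028470326962, 158.805)}
    print(with_additional_data(status, sensors))
```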
Edited by Owain Jones
And from today!
- The glider surface reason could be useful to @victord, it won't be in the first platform status output after surfacing but will be for the subsequent ones.
- It looks like this in the surface dialog:
Because:pitch not commanded [behavior surface_4 start_when = 2.0]
  - Code's already there in sfmc-adapter to parse out the string after Because: and @trishna suggests we could stuff it in the platform_state field of the status message, if that is a sensible place? (Is "reason the glider surfaced" semantically the same as the ALR-centric "current platform state")?
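Pulling the surface reason out of the dialog could look something like the sketch below. The existing sfmc-adapter parser is not shown in this thread, so this is an independent illustration; `parse_surface_reason` and the split into "reason" and bracketed "detail" are assumptions based on the single example line above.

```python
import re

# "Because:<reason> [<optional detail in square brackets>]"
BECAUSE_RE = re.compile(
    r"Because:\s*(?P<reason>.*?)\s*(?:\[(?P<detail>[^\]]*)\])?\s*$"
)


def parse_surface_reason(dialog):
    """Return (reason, detail) from the first 'Because:' line, or None if absent."""
    for line in dialog.splitlines():
        match = BECAUSE_RE.search(line)
        if match:
            return match["reason"], match["detail"]
    return None


if __name__ == "__main__":
    line = "Because:pitch not commanded [behavior surface_4 start_when = 2.0]"
    print(parse_surface_reason(line))
```

Because the first platform status after surfacing will not contain the line, callers need to handle the `None` case.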
Gonna document what those water velocity calculations Ben linked mean so I don't forget again.
This is what "final" means (or what I think their maths is doing at least). It's adjusting for the surface drift between surfacing and getting the first GPS fix. The adjustment is made once the 2nd GPS fix is in, assuming that the surface drift is constant.
- Glider surfaces at A
- Glider drifts for X seconds
- Glider gets GPS fix at B
- Glider drifts for Y seconds
- Glider gets GPS fix at C
Diff between dead-reckoned position and B is used to calculate the average water current for the dive (m_water_v[x|y])
Then the line from C to B is extended for X seconds to determine the "real" surface location of A
Diff between dead-reckoned position and A is used to calculate final (m_final_water_v[x|y])
Diff between B and C is used to calculate surface current velocity (m_surf_water_v[x|y])
Edited by Dan Jones
Not sure how any of that ^ accounts for the roll of the glider. If the glider is steering to combat the current then presumably its dead-reckoned position must be ignoring that and still assuming it's going in a straight line. Otherwise none of that ^ makes any sense at all.
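The A/B/C steps described above can be sketched numerically. This is my reading of the description, not the glider's actual maths: positions are assumed to be (x, y) in metres in a local flat frame, and dividing the position error by the dive duration to get a velocity is an assumption the thread does not spell out. All function names are hypothetical.

```python
def surface_velocity(b, c, y_secs):
    """Drift velocity (m/s) between GPS fixes B and C taken y_secs apart."""
    return ((c[0] - b[0]) / y_secs, (c[1] - b[1]) / y_secs)


def extrapolate_surfacing(b, c, x_secs, y_secs):
    """Extend the line from C through B back by x_secs of drift to estimate A."""
    vx, vy = surface_velocity(b, c, y_secs)
    return (b[0] - vx * x_secs, b[1] - vy * x_secs)


def water_velocity(dead_reckoned, actual, dive_secs):
    """Average current (m/s) implied by the dead-reckoning error over the dive."""
    return (
        (actual[0] - dead_reckoned[0]) / dive_secs,
        (actual[1] - dead_reckoned[1]) / dive_secs,
    )


if __name__ == "__main__":
    dr = (0.0, 0.0)                      # dead-reckoned surfacing position
    b, c = (100.0, 0.0), (100.0, -30.0)  # first and second GPS fixes
    x_secs, y_secs, dive_secs = 40.0, 60.0, 1000.0

    a = extrapolate_surfacing(b, c, x_secs, y_secs)
    print("surf :", surface_velocity(b, c, y_secs))   # like m_surf_water_v[x|y]
    print("water:", water_velocity(dr, b, dive_secs)) # like m_water_v[x|y]
    print("final:", water_velocity(dr, a, dive_secs)) # like m_final_water_v[x|y]
```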
Edited by Dan Jones
changed milestone to %CB-2024W47
mentioned in merge request !54 (merged)
added Status::In Review label
assigned to @trishna