FutureWarning: The behavior of DataFrame concatenation with empty or all-NA entries

**deanhystad** · Apr-22-2024, 05:58 PM

I'm not giving up on a vectorized solution quite yet, but a solution that is better than using concat is to use your logic to compute a "late_prints" column. You still pay a looping time penalty, but you avoid the larger penalty of concatenating frames a row at a time.
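A minimal sketch of that idea, under assumptions: the column name `price`, the threshold `pricedelta`, and the "compare against the last good price" rule are stand-ins for your own logic, and the first row is assumed to be a good print.

```python
import pandas as pd

# Hypothetical sample data; the column name "price" is an assumption.
df = pd.DataFrame({"price": [180.01, 180.02, 181.32, 181.31, 180.08, 180.09]})
pricedelta = 1.00  # assumed threshold for calling a print "late"

late = []                         # one boolean per row, built in a plain loop
last_good = df["price"].iloc[0]   # assumes the first row is a good print
for price in df["price"]:
    is_late = bool(abs(price - last_good) > pricedelta)
    late.append(is_late)
    if not is_late:
        last_good = price         # only good prints move the reference price

df["late_prints"] = late
clean = df[~df["late_prints"]]    # boolean indexing; no concat or _append
print(clean)
```

The loop still visits every row, but the frame is only built once, at the end, from the boolean column.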

sawtooth500 · Apr-22-2024, 06:12 PM

Thanks, yeah that's a good point using concat instead of _append. If you can think of a vector solution, I'd love that. The fact that I can have an unknown number of rows that are "late prints" is what is really throwing me off from a vector solution here...

**deanhystad** · Apr-22-2024, 09:51 PM

Quote: that's a good point using concat instead

That's not what I was trying to say. concat is better than append, but both should be used sparingly. What I was trying to say is that I would use boolean indexing to make the new dataframe, and I would use your late prints identifier to create the boolean list. Maybe I could vectorize some of that process.

I would probably start with a shift of price and time. Now I can compute a change rate (price - shifted_price) / (time - shifted_time). If I see a rapid change, I start marking data rows as suspect. I stop suspecting the data when I see a shift in the opposite direction.
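Roughly, the shift and change-rate step could look like the sketch below. The column names `time` and `price`, the sample values, and the `rate_limit` threshold are all assumptions for illustration, not taken from the real data.

```python
import pandas as pd

# Hypothetical tick data, one print per second.
df = pd.DataFrame({
    "time": pd.to_datetime([
        "2024-04-22 09:30:00", "2024-04-22 09:30:01", "2024-04-22 09:30:02",
        "2024-04-22 09:30:03", "2024-04-22 09:30:04", "2024-04-22 09:30:05",
        "2024-04-22 09:30:06",
    ]),
    "price": [180.01, 180.03, 180.01, 181.32, 181.31, 180.08, 180.09],
})

# Difference between each row and the previous row (first row becomes NaN).
dt = (df["time"] - df["time"].shift()).dt.total_seconds()
dp = df["price"] - df["price"].shift()
df["rate"] = dp / dt              # price change per second

rate_limit = 1.0                  # assumed threshold, in price units per second
df["jump_up"] = df["rate"] > rate_limit      # candidate start of suspect rows
df["jump_down"] = df["rate"] < -rate_limit   # shift in the opposite direction
print(df)
```

Comparisons against NaN are False, so the first row is never flagged. Rows that share a timestamp would divide by zero and produce an infinite rate, so real tick data may need those handled first.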

FortuneCoins · Apr-23-2024, 07:42 PM

(Apr-22-2024, 03:00 PM)sawtooth500 Wrote: This provides more info on late prints if you are curious.

https://www.youtube.com/watch?v=OZrMMOHiUeo

Although off topic, it's a very useful video.

sawtooth500 · Apr-24-2024, 01:42 AM

(Apr-22-2024, 09:51 PM)deanhystad Wrote:
Quote: that's a good point using concat instead
That's not what I was trying to say. concat is better than append, but both should be used sparingly. What I was trying to say is that I would use boolean indexing to make the new dataframe, and I would use your late prints identifier to create the boolean list. Maybe I could vectorize some of that process.

I would probably start with a shift of price and time. Now I can compute a change rate (price - shifted_price) / (time - shifted_time). If I see a rapid change, I start marking data rows as suspect. I stop suspecting the data when I see a shift in the opposite direction.

That's more or less the approach I'm taking. Finding the start of late prints is actually very easy - I just shift the dataframe by one row, and if the price difference is beyond a certain pricedelta, that's the start. The trick is finding the end of it. Consider the following example price action, with pricedelta set to 1.00:

180.01
180.01
180.02
180.02
180.03
180.01
181.32 LATE PRINT
181.31 LATE PRINT
181.32 LATE PRINT
180.08
180.08
180.09
180.10
180.19

I feel like once I ID the start of a late sequence, I need to use a for loop because I don't know how many there will be before it goes back to "normal". There are 3 late prints in this case, but I've seen as many as 13 in a row, and 13 should not be considered an upper limit. If there is no upper limit on the number of bad prints in a row, I don't know how to use a shift to find the end, since that requires knowing how many rows to shift by.
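One possible way around the unknown run length, offered only as a sketch: treat every move larger than pricedelta as a toggle, so the first big move enters a late run and the next big move (the shift in the opposite direction) leaves it. A cumulative count of big moves is then odd inside a late run and even outside it, however long the run is. This assumes the data starts on a good print, that every late run ends with a move back larger than pricedelta, and that late runs never chain directly into each other. The column name `price` is an assumption.

```python
import pandas as pd

# The example price action from above.
prices = [180.01, 180.01, 180.02, 180.02, 180.03, 180.01,
          181.32, 181.31, 181.32,
          180.08, 180.08, 180.09, 180.10, 180.19]
df = pd.DataFrame({"price": prices})
pricedelta = 1.00

# True wherever the price moved more than pricedelta from the previous row.
jump = df["price"].diff().abs() > pricedelta

# Each jump flips the state: an odd running count means "inside a late run".
df["late_print"] = jump.cumsum() % 2 == 1

clean = df[~df["late_print"]]     # boolean indexing keeps only the good prints
print(df)
```

On the example this flags only the three rows marked LATE PRINT. If a real late run drifts back gradually instead of snapping back, the parity would get out of step, so a loop may still be needed for those cases.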
