Tracking unused redux data in React components


I'm looking for a good way to track which props received by a component are not being used and can be safely removed.
In a system I maintain, our client single-page app fetches a large amount of data from some private endpoints in our backend services via redux saga. For most endpoints called, all data received is passed directly to our React components, no filtering applied. We are working to improve the overall system performance, and part of that process involves reducing the amount of data returned by our backend-for-frontend services, given those themselves call a large number of services to compose the returned JSON data, which adds to the overall response time.
Ideally, we want to make sure we only fetch the data we absolutely need and save the server from doing unnecessary calls and data normalization. So far, we've been trimming the backend services data by code inspection: we inspect the returned data for each endpoint, then inspect the front-end code, and finally remove the data we identified (as a best guess) as unused. That has proven to be risky and inefficient. Frequently we assume some data is unused, then months later find a corner case in which it was actually needed, and have to reverse the work. I'm looking for a smart, automated way to identify unused props in my app. Has anyone else had to work on something like that before? Ideas?

There's an existing library called https://github.com/aholachek/redux-usage-report , which wraps the Redux state in a proxy to identify which pieces of state are actually being used.
That may be sufficiently similar to what you're doing to be helpful, or at least give you some ideas that you can take inspiration from.
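To illustrate the general idea of proxy-based tracking, here is a minimal sketch. It is not the API of redux-usage-report; `trackUsage` and `unusedPaths` are hypothetical helpers, and the sample response is invented. The sketch wraps an object in a `Proxy`, records every property path that is read, and then lists the leaf paths that were never touched.

```typescript
// Hypothetical helper (not part of redux-usage-report): wraps a plain object
// in a Proxy and records every property path that is read.
function trackUsage<T extends object>(target: T, used: Set<string>, prefix = ""): T {
  return new Proxy(target, {
    get(obj, key, receiver) {
      const value = Reflect.get(obj, key, receiver);
      if (typeof key === "symbol") return value; // ignore internal symbol lookups
      const path = prefix ? `${prefix}.${key}` : key;
      used.add(path);
      // Wrap nested objects so that deeper reads are recorded too.
      return typeof value === "object" && value !== null
        ? trackUsage(value as object, used, path)
        : value;
    },
  });
}

// Lists the leaf paths present in the data that were never read.
function unusedPaths(data: object, used: Set<string>, prefix = ""): string[] {
  const result: string[] = [];
  for (const [key, value] of Object.entries(data)) {
    const path = prefix ? `${prefix}.${key}` : key;
    if (typeof value === "object" && value !== null) {
      result.push(...unusedPaths(value, used, path));
    } else if (!used.has(path)) {
      result.push(path);
    }
  }
  return result;
}

// Invented sample: a response with three leaf fields, of which the UI reads two.
const response = { user: { name: "Ada", email: "ada@example.com" }, theme: "dark" };
const used = new Set<string>();
const tracked = trackUsage(response, used);
const label = `${tracked.user.name} (${tracked.theme})`;
const unused = unusedPaths(response, used); // only "user.email" was never read
```

A report like this only reflects the code paths that actually ran, so it would need to be collected over a long enough period (or from real user sessions) before any field is treated as safe to remove.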

Related

The better way to build a component displaying the states of multiple devices using React

Suppose I need to make a component (using React) that displays a devices state table that displays various device attributes such as their IP address, name, etc. Polling each device takes a few seconds, so you need to display a loading indicator for each specific device in the table.
I have several ways to make a similar component:
On the server side, I can create an API endpoint (like GET /device_info) that returns data for only one requested device and make multiple parallel requests from the frontend to that endpoint, one for each device.
I can create an API endpoint (like GET /devices_info) that returns data for the entire list of devices at once on the server and make one request to it at once for the entire list of devices from frontend.
Each method has its pros and cons:
Way one:
Pros:
Easy to make. We make a "table row" component that requests data only for the device whose data it displays. The "device table" component will consist of several "table row" components that execute multiple queries in parallel each for their own device. This is especially true if you are using libraries such as React Query or RTK Query which support this behavior out of the box;
Cons:
Many requests to the endpoint, possibly hundreds (although the number of parallel requests can be limited);
If, for some reason, synchronization of access to some shared resource on the server side is required and the server supports several workers, then synchronization between workers can be very difficult to achieve, especially if each worker is a separate process;
Way two:
Pros:
One request to the endpoint;
There are no problems with access to a shared resource: everything is controlled within a single request on the server (because the request is guaranteed to be handled by a single worker);
Cons:
It's hard to make. Because polling different devices takes different amounts of time, a single request effectively has several intermediate states. You need to make periodic requests from the UI to get the updated state, and the interface has to support several states for each device, such as "not polled", "in progress", and "done";
With this in mind, my questions are:
What is the better way to make the described component?
Does the better way have a name? Maybe it's some kind of pattern?
Maybe you know a great book/article/post that describes a solution to a similar problem?

that displays a devices state
The component asking for a device state is so... 2010?
If your device knows its state, then have your device send its state to the component
SSE - Server Sent Events, and the EventSource API
https://developer.mozilla.org/.../API/Server-sent_events
https://developer.mozilla.org/.../API/EventSource
PS. React is your choice; I would go Native JavaScript Web Components, so you have ZERO dependencies on Frameworks or Libraries for the next 30 JavaScript years
Many moons ago, I created a dashboard with PHP backend and Web Components front-end WITHOUT SSE: https://github.com/Danny-Engelman/ITpings
(no longer maintained)
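As a rough sketch of the push-based idea, the function below applies one server-sent message to the table state. The message format (one device per message, as JSON), the `DeviceState` fields, the `/device-events` URL, and the `setDevices` setter are all assumptions made for illustration. Only `EventSource` and its `onmessage` handler are real browser APIs.

```typescript
// Assumed shape of one device as pushed by the server.
type DeviceState = { id: string; ip: string; name: string };

// Applies one pushed message to the current table state and returns a new Map,
// so that a React state setter sees a changed reference.
function applyDeviceMessage(
  devices: ReadonlyMap<string, DeviceState>,
  data: string
): Map<string, DeviceState> {
  const device = JSON.parse(data) as DeviceState; // assumption: valid JSON, one device
  const next = new Map(devices);
  next.set(device.id, device);
  return next;
}

// Browser wiring (hypothetical URL and setter name):
//   const source = new EventSource("/device-events");
//   source.onmessage = (event) =>
//     setDevices((prev) => applyDeviceMessage(prev, event.data));

// Two messages for the same device: the later one replaces the earlier one.
let table = new Map<string, DeviceState>();
table = applyDeviceMessage(table, '{"id":"a","ip":"10.0.0.1","name":"printer"}');
table = applyDeviceMessage(table, '{"id":"a","ip":"10.0.0.2","name":"printer"}');
```

A device with no entry in the map has not reported yet, so the table can show a loading indicator for it.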

Here is a general outline of how this approach might work:
On the server side, create an API endpoint that returns data for all devices in the table.
When the component is initially loaded, make a request to this endpoint to get the initial data for all devices.
Use this data to render the table, but show a loading indicator for each device that has not yet been polled.
Use a client-side timer to periodically poll each device that is still in a loading state.
When the data for a device is returned, update the table with the new information and remove the loading indicator.
This approach minimizes the number of API requests by polling devices only when necessary, while still providing a responsive user interface that updates in real-time.
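The per-device loading state in the outline above could be modeled along these lines. This is a sketch under assumptions: the `DeviceRow` type, the `"pending"`/`"done"` status names, and the helper functions are invented for illustration, and the actual HTTP calls and timer are left out.

```typescript
// A row is either still loading or carries the polled attributes.
type DeviceRow =
  | { id: string; status: "pending" }
  | { id: string; status: "done"; ip: string; name: string };

// Initial table: every device starts in the loading state.
function initialRows(ids: string[]): DeviceRow[] {
  return ids.map((id) => ({ id, status: "pending" as const }));
}

// Merges one poll response into the table; rows not mentioned stay unchanged.
function mergePollResult(
  rows: DeviceRow[],
  finished: { id: string; ip: string; name: string }[]
): DeviceRow[] {
  const byId = new Map(finished.map((d) => [d.id, d] as const));
  return rows.map((row) => {
    const d = byId.get(row.id);
    return d ? { id: d.id, status: "done" as const, ip: d.ip, name: d.name } : row;
  });
}

// Ids the timer still needs to ask about; polling can stop once this is empty.
function pendingIds(rows: DeviceRow[]): string[] {
  return rows.filter((row) => row.status === "pending").map((row) => row.id);
}

// Two devices, one of which has answered so far.
let rows = initialRows(["a", "b"]);
rows = mergePollResult(rows, [{ id: "b", ip: "10.0.0.7", name: "camera" }]);
const stillLoading = pendingIds(rows);
```

In a component, a timer would call the endpoint for `pendingIds(rows)` and feed the response through `mergePollResult` until nothing is pending.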
As for a name or pattern for this approach, it could be considered a form of progressive enhancement or lazy loading, where the initial data is loaded on the server-side and additional data is loaded on-demand as needed.
Both ways of making the component have their pros and cons, and the choice ultimately depends on the specific requirements and constraints of the project. However, here are some additional points to consider:
Way one:
This approach is known as "rendering by fetching" or "server-driven UI", where the server is responsible for providing the data needed to render the UI. It's a common pattern in modern web development, especially with the rise of GraphQL and serverless architectures.
The main advantage of this approach is its simplicity and modularity. Each "table row" component is responsible for fetching and displaying data for its own device, which makes it easy to reason about and test. It also allows for fine-grained caching and error handling at the component level.
The main disadvantage is the potential for network congestion and server overload, especially if there are a large number of devices to display. This can be mitigated by implementing server-side throttling and client-side caching, but it adds additional complexity.
Way two:
This approach is known as "rendering by rendering" or "client-driven UI", where the client is responsible for driving the rendering logic based on the available data. It's a more traditional approach that relies on client-side JavaScript to manipulate the DOM and update the UI.
The main advantage of this approach is its efficiency and scalability. With only one request to the server, there's less network overhead and server load. It also allows for more granular control over the UI state and transitions, which can be useful for complex interactions.
The main disadvantage is its complexity and brittleness. Managing the UI state and transitions can be difficult, especially when dealing with asynchronous data fetching and error handling. It also requires more client-side JavaScript and DOM manipulation, which can slow down the UI and increase the risk of bugs and performance issues.
In summary, both approaches have their trade-offs and should be evaluated based on the specific needs of the project. There's no one-size-fits-all solution or pattern, but there are several best practices and libraries that can help simplify the implementation and improve the user experience. Some resources to explore include:
React Query and RTK Query for data fetching and caching in React.
Suspense and Concurrent Mode for asynchronous rendering and data loading in React.
GraphQL and Apollo for server-driven data fetching and caching.
Redux and MobX for state management and data flow in React.
Progressive Web Apps (PWAs) and Service Workers for offline-first and resilient web applications.

Both approaches have their own advantages and disadvantages, and the best approach depends on the specific requirements and constraints of your project. However, given that polling each device takes a few seconds and you need to display a loading indicator for each specific device in the table, the first approach (making multiple parallel requests from frontend to an API endpoint that returns data for only one requested device) seems more suitable. This approach allows you to display a loading indicator for each specific device in the table and update each row independently as soon as the data for that device becomes available.
This approach is commonly known as "concurrent data fetching" or "parallel data fetching", and it is supported by many modern front-end libraries and frameworks, such as React Query and RTK Query. These libraries allow you to easily make multiple parallel requests and manage the caching and synchronization of the data.
To implement this approach, you can create a "table row" component that requests data only for the device whose data it displays, and the "device table" component will consist of several "table row" components that execute multiple queries in parallel each for their own device. You can also limit the number of parallel requests to avoid overloading the server.
To learn more about concurrent data fetching and its implementation using React Query or RTK Query, you can refer to their official documentation and tutorials. You can also find many articles and blog posts on this topic by searching for "concurrent data fetching" or "parallel data fetching" in Google or other search engines.
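Limiting the number of parallel requests can be sketched independently of any library. The helper below is a hypothetical `mapWithLimit`, not part of React Query or RTK Query. It runs a task per item with at most `limit` tasks in flight at once, and assumes `limit` is at least 1. The simulated "poll" just waits a few milliseconds.

```typescript
// Runs `task` for every item with at most `limit` tasks in flight at once.
// Results keep the order of `items`. Assumes limit >= 1.
async function mapWithLimit<T, R>(
  items: T[],
  limit: number,
  task: (item: T) => Promise<R>
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  async function worker(): Promise<void> {
    while (next < items.length) {
      const index = next++; // claim the next item; safe because JS runs one callback at a time
      results[index] = await task(items[index]);
    }
  }
  // Start up to `limit` workers that pull items until none are left.
  const workers = Array.from({ length: Math.min(limit, items.length) }, () => worker());
  await Promise.all(workers);
  return results;
}

// Simulated device poll that tracks how many calls overlap.
let inFlight = 0;
let peakInFlight = 0;
async function fakePoll(deviceNumber: number): Promise<number> {
  inFlight++;
  peakInFlight = Math.max(peakInFlight, inFlight);
  await new Promise((resolve) => setTimeout(resolve, 5));
  inFlight--;
  return deviceNumber * 2;
}

const polled = mapWithLimit([1, 2, 3, 4, 5], 2, fakePoll);
```

With per-row queries, the same limit would have to be enforced inside the shared fetch function rather than in the components, since each row triggers its own request.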

Browser: How to cache large data yet enable small parts to be updated?

I have a list of 20k employees to display in a React table. When the admin user changes one, I want the change reflected in the table - even if she does a reload - but I don't want to re-fetch all 20k including the unchanged 19 999.
(The table is of course paged and shows max N at once but I still need all 20k to support search and filtering, which is impractical to do server side for various reasons)
The solution I can think of is to set caching headers for /api/employees so that it is cached for e.g. one hour and have another endpoint, /api/employees?changedSince= and somehow ensure that server knows which employees have been changed. But I am sure somebody has already implemented a solution(s) for this...
Thank you!

A timestamp solution would be the best, and simplest, way to implement it. It would only require a small amount of extra data to be stored and would provide the most maintainable and expandable solution.
All you would need to do is update the timestamp when an item in the list is updated. Then, when the page loads for the first time, access /api/employees, then periodically request /api/employees?changedSince to return all of the changed rows in the table, for React to then update.
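On the client, merging the changed rows into the already loaded list might look like the sketch below. The `Employee` shape and the `updatedAt` field are assumptions, as is the use of UTC ISO 8601 timestamps in one consistent format (which is what makes plain string comparison valid here). Deleted employees are not handled.

```typescript
// Assumed row shape; updatedAt is a UTC ISO 8601 timestamp.
type Employee = { id: number; name: string; updatedAt: string };

// Merges rows returned by the changedSince request into the cached list.
function mergeChanged(cached: Employee[], changed: Employee[]): Employee[] {
  const byId = new Map(cached.map((e) => [e.id, e] as const));
  for (const e of changed) byId.set(e.id, e); // replaces edited rows, appends new ones
  return Array.from(byId.values());
}

// The newest timestamp seen becomes the next changedSince value.
function latestTimestamp(rows: Employee[], fallback: string): string {
  return rows.reduce((max, e) => (e.updatedAt > max ? e.updatedAt : max), fallback);
}

// Invented sample data: one of two cached employees was edited.
const cachedEmployees: Employee[] = [
  { id: 1, name: "Ann", updatedAt: "2024-01-01T00:00:00Z" },
  { id: 2, name: "Bob", updatedAt: "2024-01-01T00:00:00Z" },
];
const changedRows: Employee[] = [{ id: 2, name: "Robert", updatedAt: "2024-02-01T09:30:00Z" }];
const merged = mergeChanged(cachedEmployees, changedRows);
const since = latestTimestamp(merged, "1970-01-01T00:00:00Z");
```

Taking the next `changedSince` value from the server's own timestamps, rather than from the client clock, avoids missing updates when the two clocks differ.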
In terms of caching the main /api/employees endpoint, I’m not sure how much benefit you would gain from doing that, but it depends on how often the data is updated.

Since you say you are in control of the frontend's backend, IMHO this backend should cache all of the upstream data in its own database (SQL or whatever). The backend can then expose a proper API (with pagination and search).
The backend can also implement some logic to identify which rows have changed.
If the frontend needs live updates about changes, you can use a technology that allows bi-directional communication (SignalR if your backend is .NET based, something like socket.io if you have a Node backend, or even plain WebSockets).

Is it okay to have dataloader and rest api caching?

I am building a GraphQL server to wrap multiple RESTful APIs. Some of the APIs that I will be integrating are third party and some are ones we own. We use Redis as a caching layer. Is it okay if I implement DataLoader caching in GraphQL? Will it have an effect on my existing Redis caching?

Dataloader does not only serve one purpose. In fact there are three purposes that dataloader serves.
Caching: You mentioned caching. I am assuming that you are building a GraphQL gateway/proxy in front of your GraphQL API. Caching in this case means that when you need a specific resource and later on you will need it again, you can reach back to the cached value. This caching happens in memory of your JavaScript application and usually does not conflict with any other type of caching e.g. on the network.
Batching: Since queries can be nested quite deeply you will ultimately get to a point where you request multiple values of the same resource type in different parts of your query execution. Dataloader will basically collect them and resolve resources like a cascade. Requests flow into a queue and are kept there until the execution cycle ends. Then they are all "released" at once (and probably can be resolved in batches). Also the delivered Promises are all resolved at once (even if some results come in earlier than others). This allows the next execution level to also happen within one cycle.
Deduplication: Let's say you fetch a list of BlogPost with a field author of type User. In this list, multiple blog posts have been written by the same author. When the same key is requested twice, it will only be delivered once to the batch function. Dataloader will then take care of delivering the resources back by resolving the respective promises.
The point is that (1) and (3) can be achieved with a decent HTTP client that caches the requests (and not only the responses, meaning it does not fire another request when one is already running for that resource). This means that the interesting question is whether your REST API supports batch requests (e.g. api/user/1,2 in one request instead of api/user/1 and api/user/2). If so, using dataloader can massively improve the performance of your API.
Maybe you want to look into what Apollo is building right now with their RESTDatasource: https://www.apollographql.com/docs/apollo-server/v2/features/data-sources.html#REST-Data-Source
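The deduplication and batching ideas can be shown with a deliberately simplified, synchronous sketch. This is not DataLoader's API: the real library is promise-based and collects keys across one tick of the event loop. `loadBatched` and `fetchUsers` are hypothetical names, and `fetchUsers` merely stands in for a batch call such as api/user/1,2.

```typescript
// Simplified illustration: deduplicate keys, call the batch function once,
// then fan the results back out in the order the keys were requested.
function loadBatched<K, V>(keys: K[], batchFn: (uniqueKeys: K[]) => V[]): V[] {
  const uniqueKeys = Array.from(new Set(keys)); // deduplication
  const values = batchFn(uniqueKeys); // one batched call
  const byKey = new Map<K, V>();
  uniqueKeys.forEach((key, i) => byKey.set(key, values[i]));
  return keys.map((key) => byKey.get(key) as V);
}

// Hypothetical stand-in for a batch endpoint; records each call it receives.
const requestedBatches: number[][] = [];
function fetchUsers(ids: number[]): string[] {
  requestedBatches.push(ids);
  return ids.map((id) => `user-${id}`);
}

// Three blog posts, two of them written by the same author (id 1).
const authors = loadBatched([1, 2, 1], fetchUsers);
```

Here the upstream API is called once with the keys 1 and 2, and the author with id 1 is delivered to both blog posts that asked for it.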

How to persist large amounts of data through app closure?

I have noticed that apps like Instagram keep some data persistent through app closures. Even if all internet connection is removed (perhaps via airplane mode) and the app is closed, reopening it still shows the last loaded data, despite the fact that the app cannot call any loading functions from the database. I am curious as to how this is achieved. I would like to implement a similar process in my app (Xcode and Swift 4), but I do not know which method is best. I know that NSUserDefaults can persist app data, but I have seen that this is for small and uncomplicated data, which mine would not be. I know that I can store some of the data in an internal SQL database via FMDB, but some of the data I would like to persist is image data, which I am not sure exactly how to save into SQL. I also know of Core Data, but after reading through some of the documentation I have become a bit confused as to whether or not it fits my purpose. Which of these (or others?) would be best?
As an additional question, regardless of which persistence method I choose, I feel as though every time the data is actually loaded from the DB (when internet connection is available), which is in the viewDidLoad, I would need to be updating the data in the persistent storage in case the internet connection drops. I am concerned that this doubling of my writing procedures will slow the app down? Is there any validity to this concern? Or is it unavoidable anyway?

Should I be concerned with the rate of state change in my React Redux app?

I am implementing/evaluating a "real-time" web app using React, Redux, and Websocket. On the server, I have changes occurring to my data set at a rate of about 32 changes per second.
Each change causes an async message to the app using Websocket. The async message initiates a RECEIVE action in my redux state. State changes lead to component rendering.
My concern is that the frequency of state changes will lead to unacceptable load on the client, but I'm not sure how to characterize load against number of messages, number of components, etc.
When will this become a problem or what tools would I use to figure out if it is a problem?
Does the "shape" of my state make a difference to the rendering performance? Should I consider placing high change objects in one entity while low change objects are in another entity?
Should I focus my efforts on batching the change events so that the app can respond to a list of changes rather than each individual change (effectively reducing the rate of change on state)?
I appreciate any suggestions.

Those are actually pretty reasonable questions to be asking, and yes, those do all sound like good approaches to be looking at.
As a thought - you said your server-side data changes are occurring 32 times a second. Can that information itself be batched at all? Do you literally need to display every single update?
You may be interested in the "Performance" section of the Redux FAQ, which includes answers on "scaling" and reducing the number of store subscription updates.
Grouping your state partially based on update frequency sounds like a good idea. Components that aren't subscribed to that chunk should be able to skip updates based on React Redux's built-in shallow equality checks.
I'll toss in several additional useful links for performance-related information and libraries. My React/Redux links repo has a section on React performance, and my Redux library links repo has relevant sections on store change subscriptions and component update monitoring.
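The batching idea raised in the question could be sketched as a small buffer between the WebSocket handler and the store. `ChangeBatcher`, the `Change` shape, and the 250 ms interval in the wiring comment are assumptions for illustration. The sketch also assumes that only the latest value per id matters, so earlier changes to the same id within a batch are overwritten.

```typescript
// Assumed shape of one incoming change.
type Change = { id: string; value: number };

// Collects incoming changes and hands them over as one list per flush.
// Later changes to the same id overwrite earlier ones within a batch.
class ChangeBatcher {
  private pending = new Map<string, Change>();
  private onFlush: (changes: Change[]) => void;

  constructor(onFlush: (changes: Change[]) => void) {
    this.onFlush = onFlush;
  }

  push(change: Change): void {
    this.pending.set(change.id, change);
  }

  flush(): void {
    if (this.pending.size === 0) return; // nothing to dispatch
    const changes = Array.from(this.pending.values());
    this.pending.clear();
    this.onFlush(changes);
  }
}

// Wiring (hypothetical names):
//   socket.onmessage = (event) => batcher.push(JSON.parse(event.data));
//   setInterval(() => batcher.flush(), 250); // a few dispatches per second instead of 32

// Three messages arrive, two of them for the same id, then one flush.
const dispatched: Change[][] = [];
const batcher = new ChangeBatcher((changes) => dispatched.push(changes));
batcher.push({ id: "a", value: 1 });
batcher.push({ id: "b", value: 2 });
batcher.push({ id: "a", value: 3 });
batcher.flush();
batcher.flush(); // nothing pending, so no second dispatch
```

The flush callback would dispatch a single RECEIVE-style action carrying the whole list, so the store notifies subscribers once per batch rather than once per message.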
