I'm using Flink v1.11.2 and trying to sink my protobuf data to HDFS. I took the code from the documentation.
My code is as follows:
val writer = ParquetProtoWriters.forTypeWithConf(classOf[RawSample], CompressionCodecName.GZIP)
val sinker = StreamingFileSink
  .forBulkFormat(new Path(option.dumpOutputPath), writer)
  .withBucketAssigner(new DateTimeBucketAssigner[RawSample]("yyyy-MM-dd/HH"))
  .withRollingPolicy(OnCheckpointRollingPolicy.build())
  .withBucketCheckInterval(option.rolloverInterval)
  .withOutputFileConfig(OutputFileConfig.builder().withPartSuffix(".gz.parquet").build())
  .build()
I copied the ParquetProtoWriters code to support gzip compression. RawSample is a protobuf-generated class. It did sink files to HDFS, but the file names look like this:
└── 2019-08-25/12
    ├── part-0-0.gz.parquet
    ├── part-0-1.gz.parquet
    ├── ...
    ├── part-0-9.gz.parquet
└── 2019-08-25/13
    ├── part-0-10.gz.parquet
    ├── part-0-11.gz.parquet
    ├── ...
    ├── part-0-19.gz.parquet
└── 2019-08-25/14
    ├── part-0-20.gz.parquet
    ├── part-0-21.gz.parquet
    ├── ...
    ├── part-0-29.gz.parquet
The partFileIndex field (from the "Part file configuration" section of the documentation) keeps growing. Is there any way to make it start from 0 for every hour, so that the output looks like this?
└── 2019-08-25/12
    ├── part-0-0.gz.parquet
    ├── part-0-1.gz.parquet
    ├── ...
    ├── part-0-9.gz.parquet
└── 2019-08-25/13
    ├── part-0-0.gz.parquet
    ├── part-0-1.gz.parquet
    ├── ...
    ├── part-0-9.gz.parquet
└── 2019-08-25/14
    ├── part-0-0.gz.parquet
    ├── part-0-1.gz.parquet
    ├── ...
    ├── part-0-9.gz.parquet
Unfortunately, no. partFileIndex is incremented globally, across all buckets, to prevent duplicate file names.
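To see why the index never resets, here is a small JavaScript model of the naming scheme. It is only an illustration and not Flink's code: it assumes a single counter per sink subtask that every bucket shares, and createPartNamer is a made-up helper name.

```javascript
// Simplified model (not Flink's implementation): one part counter per sink
// subtask, shared by every bucket that the subtask writes to.
function createPartNamer(subtaskIndex, suffix) {
  let partCounter = 0; // never reset when a new bucket (hour) starts
  return (bucketId) =>
    `${bucketId}/part-${subtaskIndex}-${partCounter++}${suffix}`;
}

const nextPartFile = createPartNamer(0, ".gz.parquet");
const names = [
  nextPartFile("2019-08-25/12"),
  nextPartFile("2019-08-25/12"),
  nextPartFile("2019-08-25/13"), // new bucket, counter keeps counting
];
console.log(names);
```

In this model the first file of the 13 o'clock bucket gets index 2 rather than 0, which mirrors the behaviour observed in the question.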
Nextjs - first level dynamic routing for many pages each of which has children
I have a NextJS 12.2/React 18 app linked to a CMS, where users can set up their custom page structure, content & theme.
In the front-end, there will be some API calls to get the app
settings and configuration including the page hierarchy.
The default six top-level page names are homepage, content, events,
user, search and blog.
We can think of these six pages as types, so we have six different
types, and users can change the names of these types but not the type
itself. And as you may expect, changing the names will change the URL
segment accordingly.
Four of those six types can have child-page(s), multi-level (nested).
Here are some examples of what the default app's URLs can look like:
mydomain.com/homepage
mydomain.com/content/
mydomain.com/events/by-location/[locationID]
mydomain.com/events/[eventID]
mydomain.com/events/by-category/[catLevelOne]
mydomain.com/user/profile
mydomain.com/blog/[postID]
mydomain.com/search?term=searchterm
mydomain.com/contact-us
Notice: there is also a static page "contact-us" which is not connected to the CMS.
To achieve this model within the NextJS methodology of routing, the app file structure should look like this (if it is allowed):
pages
├── [blog]
│ ├── [postID].js
│ └── index.js
├── [content]
│ ├── [levelOne]
│ │ ├── [levelTwo]
│ │ │ ├── [levelThree]
│ │ │ │ └── index.js
│ │ │ └── index.js
│ │ └── index.js
│ └── index.js
├── [events]
│ ├── by-category
│ │ ├── [catLevelOne]
│ │ │ ├── [catLevelTwo]
│ │ │ │ └── index.js
│ │ │ └── index.js
│ │ └── index.js
│ ├── by-location
│ │ ├── [location]
│ │ │ ├── [locationID].js
│ │ │ └── index.js
│ │ └── index.js
│ ├── index.js
│ └── [eventID].js
├── [user]
│ ├── account.js
│ ├── emailverification.js
│ ├── password.js
│ └── register.js
├── search.js
└── homepage.js
And this is not possible as we cannot have two folders with brackets on the same level!
If I hard-code the file names of the top-level pages, they work just fine. The problem starts when the user changes the default names (homepage, content, events, etc.) to something else: NextJS then has no idea what the page structure is, so it throws a 404 Not Found straight away, because fetching the page structure from the API takes a few seconds.
What I have tried:
middleware: it turned out that we can't fetch data inside middleware.js, and if we did, there would be tens of API calls each time the route changes.
I tried the single dynamic route method: I fetched the data server-side inside a [wildcard].js file and then, based on the data that came back, rewrote the URL to route the requests in the right direction based on the page types, which are basically the hard-coded page names. That works smoothly for the top-level pages but not for their children.
I know I am late to the party, but in case someone still needs this: I think you were very close. Using shallow routing with your second approach should do the job. Basically, add [...wildcard].js and, as you described:
based on the data that comes back, rewrite the URL to route the requests in the right direction based on the page types
A good place to do this is the getServerSideProps function: you can make requests there, it will be faster, and you don't have to worry about any glitch when the page loads.
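A rough sketch of that idea, assuming the CMS returns a map from the user's custom segment names to the fixed page types. fetchPageTypes, resolvePage and the segment names are hypothetical. The sketch passes the resolved type to the page as props instead of rewriting the URL, so adapt that part to your setup.

```javascript
// Hypothetical stand-in for the CMS call that returns the user's custom
// top-level segment names mapped to the fixed page types.
async function fetchPageTypes() {
  return { "whats-on": "events", articles: "blog", library: "content" };
}

// Translate the segments captured by [...wildcard].js into the fixed page
// type plus the remaining child segments; null means no page type matches.
function resolvePage(segments, typesBySegment) {
  const [first, ...rest] = segments;
  if (!Object.prototype.hasOwnProperty.call(typesBySegment, first)) {
    return null;
  }
  return { pageType: typesBySegment[first], childSegments: rest };
}

// In a real app this function would be exported from pages/[...wildcard].js.
async function getServerSideProps({ params }) {
  const typesBySegment = await fetchPageTypes();
  const page = resolvePage(params.wildcard, typesBySegment);
  // Unknown first segment: let Next.js render its 404 page.
  return page ? { props: page } : { notFound: true };
}
```

Because the catch-all route receives every segment as an array, the child pages (the part that did not work with a single [wildcard].js) arrive in childSegments and can be handled by the component for that page type.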
Also in case anyone has any doubts:
Of course we can't have two folders with brackets on the same level: if you have [id1] and [id2], how would NextJS know what mydomain.com/1234 should be? There is no way to tell whether 1234 is id1 or id2.
I think the last part is probably obvious, but still... In case anyone wonders.
I am using React for frontend development and doing client-side routing with react-router. While integrating it with my warp backend, I have come across some obstacles.
After building the React app with npm run build, I move the build folder to my Rust project. According to the create-react-app documentation, I need to serve the build folder and serve the index.html file for any matching GET request. I could not achieve this in warp the way the express example in the documentation does.
Here is the build folder example.
build
├── asset-manifest.json
├── favicon.ico
├── index.html
├── manifest.json
├── robots.txt
└── static
    ├── css
    │   ├── main.089e2544.css
    │   └── main.089e2544.css.map
    └── js
        ├── main.ba6a006a.js
        ├── main.ba6a006a.js.LICENSE.txt
        └── main.ba6a006a.js.map
3 directories, 10 files
Here is the line in index.html that includes the script:
<script defer="defer" src="/static/js/main.ba6a006a.js"></script>
Using warp::fs::dir("build") was enough to see the main page, since it serves the index.html file implicitly. But if I manually type a URL, for example 127.0.0.1:8080/login, and press Enter, it does not process the request.
The way to implement a "fallback" is to simply use .or() which will attempt to use the next filter if the one before didn't match. So if the required behavior is to serve from the "build" directory or else serve "build/index.html", that can be done like this:
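use warp::Filter;

#[tokio::main]
async fn main() {
    // Serve files from "build"; if no file matches, fall back to index.html
    // so that react-router can handle the path on the client.
    let routes = warp::filters::fs::dir("build")
        .or(warp::filters::fs::file("build/index.html"));

    warp::serve(routes)
        .run(([127, 0, 0, 1], 8080))
        .await;
}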
I'm trying to find my way in setting up an efficient web development workflow, even though I am new to React and bundlers (and the combination of the two). My goal is to have each React component use its own SCSS file, which can't be accessed by other components' stylesheets. There need to be some global files though, like a variables.scss, which is where the problem lies.
This is how my file structure (from the src folder) looks at the moment:
├── assets
├── components
│ ├── App
│ │ ├── index.js
│ │ └── index.scss
│ ├── Taskbar
│ │ ├── index.js
│ │ └── index.scss
│ └── Window
│ ├── index.js
│ └── index.scss
├── index.js
└── sass
├── mixins.scss
└── variables.scss
I'm using Parcel with node-sass to import these SCSS files. As you can see, each component has its own folder with a SCSS file in it.
Each component loads its designated stylesheet like this:
import React from 'react';
import './index.scss';
import Taskbar from '../Taskbar';

export default class App extends React.Component {
    render() {
        return (
            <div id="app">
                <Taskbar />
            </div>
        )
    }
}
I'd like to keep this approach, as the stylesheets are only used if the components themselves are imported.
The problem
If you take one more look at my file structure, there are two files in the sass folder (variables.scss and mixins.scss). I want all the files in that folder to be accessible to all components.
I've tried simply importing the files in App, so that they can be accessed in all files, like so:
import '../../sass/variables.scss';
import '../../sass/mixins.scss';
These do load of course, but I can't use the variables and functions declared in these files in, for example, Taskbar/index.scss.
So that would make my question: How do I use these global sass variables/functions in "protected" SCSS files?
Welcome to stackoverflow! Great first post. Very informative, clear and to the point.
Ideally, your shared SCSS files will be partials, and you will load them with the Sass @import syntax inside each component's stylesheet rather than with a JavaScript import.
From the Sass guide:
You can create partial Sass files that contain little snippets of CSS
that you can include in other Sass files. This is a great way to
modularize your CSS and help keep things easier to maintain. A
partial is simply a Sass file named with a leading underscore. You
might name it something like _partial.scss. The underscore lets Sass
know that the file is only a partial file and that it should not be
generated into a CSS file. Sass partials are used with the @import
directive.
Some additional notes...
A subjectively "better" folder structure (see notes below for why):
└── src
├── assets
├── components
| |
│ ├── App
| | ├── __tests__
| | | └── App.test.js
│ │ ├── index.js
│ │ └── styles.scss (App.scss)
| |
│ ├── Taskbar
| | ├── __tests__
| | | └── Taskbar.test.js
│ │ ├── index.js
│ │ └── styles.scss (Taskbar.scss)
| |
│ └── Window
| ├── __tests__
| | └── Window.test.js
│ ├── index.js
│ └── styles.scss (Window.scss)
|
├── styles
| ├── _mixins.scss
| ├── _variables.scss
| └── styles.scss (exports all shared partials)
|
└── index.js
Avoid using index.scss for component-level stylesheets because, when you start adding tests, you'll confuse Webpack as to which import you want if you just write import Component from "./index" without an extension. This has a 50/50 chance of throwing "export is not a class or function" errors. As a general rule of thumb, I either use the same name as the parent folder or use styles, and add the .scss extension to the import to distinguish it from a normal .js import.
Optional: You can import partials into a single non-partial file and import that file into each component-level stylesheet. See my example here. This saves you from writing multiple @import statements over and over for each component-level stylesheet, but has the disadvantage of importing everything when you may only want one thing.
And... if you're bored and have some time, I go into detail as to why I like this folder structure.
So I have a decent-sized Angular application. This being my first Ng project, I wanted to keep it simple and had everything in one file, app.js, when starting out.
var app = angular.module('myApp', []);
app.factory( ... );
app.controller( ... );
...
...
As expected, this has now become hard to manage. So I decided to split the functionality within files and combine all of the files using grunt-concat. This is how my structure looks -
.
├── app.js
├── controllers
│ ├── address.js
│ ├── delivery.js
│ ├── editaddress.js
│ ├── login.js
│ └── newaddress.js
├── filters
│ └── filters.js
└── services
├── address_service.js
├── cartservice.js
├── constants.js
├── services.js
└── transformrequest.js
This has worked out quite well and it's easy to manage. However, I'm quite at a loss on how to organize for testing.
For example, services were earlier defined as
angular.module("app.services", [])
    .factory('CartService', function() {
    })
    .factory(...)
This allowed me to write Karma tests properly. Now that the services are spread across files, adding angular.module("app.services", []) in each file gives an error, and I am not sure how to organize my code across files and keep it testable.
I looked at angular-seed, but even they have only one file for controllers, services etc.
So my question: how can I organize my code across multiple files and still make it testable in Karma?
Thanks a lot!
I separate my files into a directory structure very similar to yours.
I define my app in app.js:
angular.module('myApp', []);
Then in the other files you don't have to use the "app" instance. You can reference it this way:
angular.module('myApp').factory( ... );
angular.module('myApp').controller( ... );
If you use this method, you don't have to concatenate files, you can include them one by one in your html or in karma tests.
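The difference between the two call forms matters here: angular.module(name, []) creates the module and replaces any existing module with that name, while angular.module(name) only retrieves it. The sketch below mimics that behaviour with a tiny stand-in registry so it runs without AngularJS. createRegistry and ngModule are made-up names, and the real angular.module does far more than this.

```javascript
// Simplified stand-in for AngularJS's module registry (illustration only).
function createRegistry() {
  const modules = new Map();
  return function getOrCreate(name, requires) {
    if (requires !== undefined) {
      // Setter form: (re)creates the module, discarding earlier registrations.
      const mod = {
        name,
        requires,
        factories: {},
        factory(id, fn) {
          this.factories[id] = fn;
          return this; // allow chaining, as in AngularJS
        },
      };
      modules.set(name, mod);
      return mod;
    }
    // Getter form: returns the existing module or fails loudly.
    if (!modules.has(name)) {
      throw new Error(`Module '${name}' is not available`);
    }
    return modules.get(name);
  };
}

const ngModule = createRegistry();

// app.js: declare the module exactly once.
ngModule("app.services", []);

// cartservice.js and address_service.js: retrieve it, never redeclare it.
ngModule("app.services").factory("CartService", () => ({}));
ngModule("app.services").factory("AddressService", () => ({}));

console.log(Object.keys(ngModule("app.services").factories));
```

Calling the setter form again in a second file would start from an empty module, which is why repeating angular.module("app.services", []) in every file causes trouble.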
My karma.conf.js looks like this:
...
files: [
    'lib/angular/angular.js',
    'lib/angular-mocks/angular-mocks.js',
    'src/app.js',
    'src/**/*.js',
    'tests/**/*.js'
],
...
I have a multiple page backbone app based off of this example: https://github.com/asciidisco/grunt-requirejs/tree/master/examples/multipage-shim and it is working fine for the base url. The problem comes when I navigate to a page that is no longer at the root of the domain.
The directory structure looks like this:
scripts
├── app
│ ├── controller
│ │ ├── Base.js
│ │ ├── c1.js
│ │ └── c2.js
│ ├── lib.js
│ ├── main1.js
│ ├── main2.js
│ ├── model
│ │ ├── Base.js
│ │ ├── m1.js
│ │ └── m2.js
├── common.js
├── page1.js
└── page2.js
So, e.g. if I navigate to http://localhost/, everything loads correctly with the following script tag:
<script data-main="/scripts/page1" src="/path/to/require.js"></script>
(This loads page1, which in turn loads common.js and main1.js).
However, if I navigate to http://localhost/another/url/, the same script tag successfully loads page1.js and common.js, but when it tries to load main1.js I get a 404, because it is loading from a relative URL (it tries to load http://localhost/another/url/scripts/app/main1.js).
My baseUrl is set to 'scripts', and I am building using grunt (https://github.com/asciidisco/grunt-requirejs).
The contents of page1.js is just this:
//Load common code that includes config, then load the app logic for this page.
require(['./common'], function (common) {
    require(['app/main1']);
});
For anyone stumbling upon this question, I found this workaround:
require.js supports having two separate baseUrl parameters, one for the build step, and one to be used by the deployed javascript.
By setting build baseUrl: 'scripts' and the deployed baseUrl: '/scripts' I was able to ensure that require always tries to fetch scripts from the root on the server.
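The effect of the leading slash can be checked with plain URL resolution. The snippet below is a simplified model of how a browser resolves the script path against the current page, not RequireJS's actual loader, and scriptUrl is a made-up helper.

```javascript
// Join baseUrl and module ID, then resolve the result against the page URL
// the same way a browser resolves a relative script src.
function scriptUrl(pageUrl, baseUrl, moduleId) {
  return new URL(`${baseUrl}/${moduleId}.js`, pageUrl).href;
}

const nestedPage = "http://localhost/another/url/";

// Relative baseUrl: resolved against the nested page, which gives the 404.
console.log(scriptUrl(nestedPage, "scripts", "app/main1"));
// -> http://localhost/another/url/scripts/app/main1.js

// Root-relative baseUrl: always resolved from the server root.
console.log(scriptUrl(nestedPage, "/scripts", "app/main1"));
// -> http://localhost/scripts/app/main1.js
```

On the root page (http://localhost/) both forms resolve to the same URL, which is why the problem only shows up on nested routes.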